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Abstract 


A  SINGLE-PHASE  METHOD  FOR  QUADRATIC  PROGRAMMING 


Stephen  Carey  Hoyle,  Ph.D. 
Stanford  University,  1986 


This  thesis  will  describe  ,a  single-phase  quadratic  programming  (QP)  method. 
A  need  for  such  a  method  is  apparent  when  solving  a  sequence  of  closely-related 
QP’s.  One  example  of  this  is  in  the  area  of  nonlinear  programming,  which  uses  se¬ 
quential  quadratic  programming.  Another  example  is  sensitivity  analysis  involving 
the  constraints.  In  both  of  these  examples,  if  the  quadratic  problems  are  large-scale 
and  the  initial  point  is  infeasible,  the  single-phase  method  of  this  thesis  may  be 
particularly  useful. 

If  the  Hessian  is  indefinite,  the  method  will  find  a  local  minimum  when  solving 
a  dense  QP,  and  a  constrained  stationary  point  when  solving  a  large-scale  QP.  If 
the  Hessian  is  positive  definite,  it  will  find  the  global  minimum.  Additionally,  if  the 
QP  is  known  to  be  positive  definite,  the  method  converges  more  quickly  than  if  the 
QP  were  treated  as  a  general  indefinite  problem. 

The  single-phase  method  of  this  thesis  is  an  active-set  method  which  solves 
a  sequence  of  equality-construnt  quadratic  programs  (EQP).  It  differs  from  other 
active-set  methods  in  that  the  current  iterate  may  violate  constraints  in  the  working 
set  (the  prediction  of  the  active  set).  Special  care  has  been  taken  to  avoid  rank 
deficiency  in  the  working  set.  An  efficient  means  of  solving  the  successive  EQP’s  of 
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a  large-scale  problem  is  included. 

Finally,  an  implementation  of  the  method  for  dense  QP’s  is  described.  The 
implementation  includes  several  new  strategies  for  deleting  constraints  from  the 
working  set.  Computational  results  of  the  implementation  are  discussed.  In  partic¬ 
ular,  when  solving  positive-definite  QP’s  in  which  the  dimension  of  the  null  space 
at  the  solution  was  about  one  half  the  number  of  variables,  the  implementation 
required  approximately  one  half  as  many  EQP  solves  as  did  an  efficient  two-phase 
algorithm. 
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Chapter  One 
Introduction 


This  thesis  will  present  and  describe  the  implementation  of  a  single-phase 
quadratic  programming  (QP)  method.  A  single-phase  method  works  towards  sat¬ 
isfying  optimality  and  feasibility  conditions  simultaneously.  Many  QP  methods  are 
two-phase,  since  they  first  find  a  feasible  point,  and  then  an  optimal  point.  Each 
QP  method  must  ascertain  if  the  Kuhn-Tucker  conditions  are  satisfied.  This  can 
be  determined  by  solving  the  augmented  system  of  linear  equations,  which  state 
the  Kuhn-Tucker  optimality  and  feasibility  conditions.  If  a  method  solves  the  aug¬ 
mented  system  without  transformation,  then  it  is  called  a  Lagrangian  method  (see 
Gould,  1984).  If  a  method  transforms  the  augmented  system  into  several  smaller 
systems,  then  it  is  called  a  projection  method. 

When  the  QP  problem  is  dense,  there  may  be  no  overwhelming  reason  to  pre¬ 
fer  a  single-phase  method  to  a  two-phase  method,  or  a  Lagrangian  method  to  a 
projection  method.  This  is  not  true  in  a  large-scale  problem.  There  is  a  fundamen¬ 
tal  objection  to  using  a  projection  method  on  a  large-scale  problem.  The  smaller 
linear  systems  in  a  projection  method  involve  forming  products  of  sparse  matrices. 
Products  of  sparse  matrices  are  generally  not  sparse.  Thus,  the  projection  methods 
may  be  computationally  expensive  since  they  cannot  use  sparse  matrix  techniques. 
Lagrangian  methods  do  not  destroy  the  inherent  sparseness  of  the  problem,  since 
the  augmented  system  is  solved  without  transformation. 

There  are  several  objections  to  using  a  two-phase  Lagrangian  method  on  a 
large-scale  problem.  First,  different  code  is  usually  required  for  each  of  the  two 
phases.  Second,  the  factorizations  calculated  in  the  first  phase  cannot  usually  be 
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used  in  the  second  phase.  The  work  to  compute  these  factorizations  is  typically 
a  large  portion  of  the  total  computational  effort.  Third,  a  two-phase  (projection 
or  Lagrangian)  method  ignores  the  objective  function  during  the  first  phase.  This 
implies  that  even  though  the  initial  infeasible  point  may  be  “near”  optimal,  the 
first  phase  may  calculate  a  feasible  point  that  is  *far”  from  optimal.  In  a  single¬ 
phase  Lagrangian  method,  the  first  two  objections  do  not  apply.  By  working  towards 
optimality  and  feasibility  simultaneously,  a  single-phase  method  should  diminish  the 
third  objection.  Thus,  a  single-phase  Lagrangian  method  may  be  computationally 
preferable  when  solving  a  large-scale  QP  problem.  The  purpose  of  this  thesis  is 
to  present  a  single-phase  method  which  can  use  a  Lagrangian  method  to  solve  the 
augmented  system. 

There  are  two  main  reasons  for  developing  this  method.  First,  its  computational 
attractiveness  alone  makes  it  worthwhile  to  study  as  a  different  QP  method.  Second, 
a  need  for  such  a  method  is  apparent  when  solving  a  sequence  of  closely-related  QP’s. 
One  example  of  this  is  in  the  area  of  nonlinear  programming  (NLP),  which  uses 
sequential  quadratic  programming  (SQP).  Another  example  is  sensitivity  analysis 
involving  the  constraints.  In  both  of  these  examples,  if  the  quadratic  problems  are 
large-scale  and  the  initial  point  is  infeasible,  the  single-phase  method  of  this  thesis 
may  be  particularly  useful. 

Some  QP  methods  will  accommodate  an  mdeGnite  Hessian.  Others  restrict 
themselves  to  the  subset  of  quadratic  problems  with  positive  definite  Hessians.  If 
the  Hessian  is  indefinite,  the  single-phase  method  of  this  thesis  will  find  a  local 
minimum  when  solving  a  dense  QP,  and  a  constrained  stationary  point  when  solving 
a  large-scale  QP.  If  the  Hessian  is  positive  definite,  it  will  find  the  global  minimum. 
Additionally,  if  the  QP  is  known  to  be  positive  definite,  the  method  converges  more 
quickly  than  if  the  QP  were  treated  as  a  general  indefinite  problem. 


Rank  deBciency  in  the  working  set  is  not  a  theoretical  problem  in  most  QP 
methods.  However,  because  the  single-phase  method  presented  here  defines  the 
working  set  differently,  rank  deficiency  is  possible.  Thus,  the  management  of  this 
rank  deficiency  is  an  important  part  of  the  single-phase  method. 

Chapter  Two  describes  the  basic  steps  of  an  active-set  method.  It  concludes 
with  a  description  of  several  alternative  active-set  QP  methods. 

Chapter  Three  presents  the  single-phase  active-set  method  of  this  thesis.  Before 
presenting  the  method,  it  describes  a  general  strategy  to  maintain  nonsingularity 
in  the  augmented  system.  Also  discussed  is  an  efficient  method  for  solving  the 
successive  augmented  systems  of  a  large-scale  QP  problem. 

In  Chapter  Four,  a  single-phase  method  for  dense  QP’s  {QPSFA)  is  described. 
{QPSFA  is  derived  phonetically  from  single-phase  algorithm  for  QP’s)  The  descrip¬ 
tion  explains  how  the  required  factorizations  are  updated,  and  how  the  method  han¬ 
dles  an  indefinite  Hessian.  Also  described  are  various  constraint  deletion  strategies 
to  be  tested.  The  last  section  compares  QPSFA  to  the  alternative  QP  algorithms 
described  in  Chapter  Two. 

Chapter  Five  presents  computational  results  of  the  implementation  of  QPSFA. 
It  begins  with  a  description  of  how  the  quadratic  programs  are  randomly  generated. 
Then  the  different  constraint  deletion  strategies  of  QPSFA  are  compared.  Finally, 
the  implementation  is  compared  to  an  alternative  QP  algorithm. 


Chapter  Two 

Active-Set  Quadratic  I*rograininiiig 


2.1.  Overview  of  an  active-set  method 

The  quadratic  programming  problem  is  assumed  to  be  stated  in  the  following  form: 


QPl  minimize  c^x  -i- 

xeS"  2 

subject  to  Ax  =  b,  x>  0, 

where  c  is  a  constant  n-vector,  If  is  a  constant  n  x  n  symmetric  matrix,  and  A  is  an 
my.  n  matrix.  The  constraints  involving  A  will  be  called  general  constraints;  the 
remaining  constraints  will  be  called  bounds.  The  vector  g  =  c  +  Ux  will  denote  the 
gradient  of  the  quadratic  objective  function.  The  vector  r=Ax-b  will  denote  the 
residual  of  the  general  constraints. 

The  method  presented  in  this  thesis  is  an  active-set  method.  The  active  set  is 
the  set  of  constraints  that  are  satisfied  exactly  at  the  solution.  Briefiy,  an  active- 
set  method  is  an  iterative  process.  At  the  start  of  each  iteration,  the  method 
maintains  a  prediction  of  the  active  set.  This  prediction  is  called  the  working  set. 
The  method  determines  if  the  prediction  is  correct.  If  it  is  correct,  the  method 
concludes  successfully.  If  it  is  not  correct,  a  search  direction  is  constructed  and  a 
step  length  along  the  search  direction  is  calculated. 

The  constraints  in  QPl  can  be  written  in  other  equivalent  forms.  The  form 
used  here  was  chosen  for  several  reasons.  First,  the  general  constraints  are  always 
in  the  working  set,  so  changes  to  the  working  set  will  consist  only  of  adding  and 
deleting  bounds.  This  will  make  the  description  of  the  method  simpler.  Second, 


Section  2.1. 


Overview  of  an  active-aet  method  5 


since  only  bounds  in  the  working  set  can  be  changed,  we  shall  primarily  be  working 
with  the  columns  of  A.  In  a  large-scale  problem,  where  many  columns  of  A  are 
columns  of  the  identity  matrix  and  as  such  are  not  stored  explicitly,  it  is  easier  to 
work  with  the  columns  than  the  rows  of  the  constraint  matrix. 

The  search  direction  is  constructed  so  that  the  general  constraints  are  exactly 
satisfied  at  a  step  length  of  one,  and  the  bounds  in  the  working  set  are  always 
satisfied.  For  a  bound  constraint  in  the  working  set,  this  construction  is  achieved 
by  setting  the  corresponding  component  of  the  search  direction  to  zero.  Thus,  the 
associated  variable  is  fixed,  and  specification  of  the  working  set  induces  a  partition 
of  X  into  fixed  and  free  variables.  Let  the  subscripts  “fr"  and  “fx"  denote  a 
vector  or  matrix  whose  elements  are  associated  with  the  free  and  fixed  variables, 
respectively.  During  a  given  iteration,  the  fixed  variables  are  effectively  removed 
from  the  problem.  Hence,  we  let  p  denote  the  search  direction  with  respect  to  the 
free  variables  only. 

Let  a  denote  the  step  length.  Once  the  search  direction  and  step  length  are 
specified,  the  new  iterate,  I,  is  defined  by 

£pii  =  Xfn  H-  ap. 

Due  to  the  quadratic  nature  of  the  objective  function,  there  are  only  two  choices 
of  step  length.  A  step  of  unity  along  p  is  the  exact  step  to  the  minimum  of  the 
quadratic  objective  function  restricted  to  the  working  set.  If  a  step  of  unity  can 
be  taken,  the  next  iterate  will  be  a  constrained  stationary  point  with  respect  to 
the  working  set,  and  exact  Lagrange  multipliers  can  be  computed  to  determine  if  a 
constraint  should  be  deleted.  Otherwise,  the  step  along  p  to  the  nearest  constraint 
is  less  than  unity,  and  a  new  constraint  will  be  included  in  the  working  set  at  the 
next  iterate.  Thus,  the  working  set  is  updated  by  possibly  adding  or  deleting  a 
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constraint  before  the  next  iteration  begins.  For  simplicity,  we  shall  always  consider 
a  typical  iteration  and  avoid  reference  to  the  index  of  the  iteration. 

Calculation  of  the  search  direction  will  now  be  considered.  Let  H  and  g  denote 
the  components  of  M  and  g,  respectively,  that  correspond  to  the  free  variables.  Let 
ripR  and  npx  denote  the  number  of  free  and  fixed  variables,  respectively.  Finally,  let 
C  denote  the  mx  ripn  submatrbc  of  A  corresponding  to  the  free  variables.  The  search 
direction  p  is  chosen  so  that  Xp^  +  p  is  the  solution  of  the  quadratic  programming 
problem  with  the  original  objective  function  restricted  to  the  free  variables,  subject 
to  the  condition  that  Zpr  +  p  exactly  satisfies  the  constraints  in  the  working  set. 
With  this  definition,  p  is  the  solution  to  the  following  equality-constraint  quadratic 
programming  problem: 

EQP  minimize  g'^p  -f  y’'Bp 

p  ^ 

subject  to  Cp  =  — r. 

The  optimality  conditions  for  the  EQP  state  the  existence  of  a  vector  of  La¬ 
grange  multipliers,  denoted  by  fi,  that  satisfy 

C^ls  =  g-i-Hp.  (2.1) 

The  above  feasibility  and  optimality  conditions  for  the  EQP  can  be  satisfied  by 
solving  the  augmented  system  of  equations  given  by 

for  p  and  /i.  The  major  differences  among  QP  methods  arise  from  the  numerical 
methods  that  solve  the  augmented  system,  and  the  strategies  that  control  changes 
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in  the  working  set.  Some  QP  methods  are  equivalent  in  the  sense  that  they  calcu¬ 
late  identical  sequences  of  iterates  when  applied  to  the  same  problem.  Thus,  two 
algorithms  which  may  appear  to  be  based  on  different  methods,  sometimes  differ 
only  in  the  numerical  method  used  to  solve  the  augmented  system.  (See  Djang, 
1980,  and  Best,  1985.)  Some  of  the  methods  for  solving  the  augmented  system  are 
given  in  Section  2.2.  Several  strategies  that  control  changes  in  the  working  set  are 
discussed  in  Section  2.3.  Following  this,  initialization  procedures  are  presented  in 
Section  2.4.  In  Section  2.5,  several  QP  algorithms  are  summarized — including  how 
they  solve  the  EQP,  change  the  working  set,  and  initialize  the  problem. 

2.2.  Solving  an  EQP 

We  shall  briefly  survey  several  methods  for  solving  the  augmented  system  (2.2). 
Lagrangian  or  Kubn-Tucker  methods  use  a  factorization  of  the  entire  matrbc  K 
in  (2.2)  to  solve  for  p  and  ft.  Projection  methods  for  solving  the  augmented  sys¬ 
tem  are  based  upon  breaking  down  the  augmented  system  into  several  (usually 
two)  simpler  systems.  The  factorizations  needed  to  solve  these  simpler  systems 
are  of  a  smaller  dimension  than  the  factorizations  used  in  the  Lagrangian  method. 
Projection  methods  for  solving  the  augmented  system  can  be  further  classified  as 
range-space  or  null-space  methods.  This  terminology  arises  because  the  working  set 
can  be  viewed  as  defining  two  complementary  subspaces:  the  range  space  of  vectors 
that  can  be  expressed  as  linear  combinations  of  the  rows  of  C,  and  the  null  space 
of  vectors  orthogonal  to  the  columns  of  C.  In  many  cases,  the  work  required  to 
solve  the  augmented  system  is  directly  proportional  to  the  dimension  of  either  the 
range  space  or  the  null  space.  For  example,  the  methods  of  Murray  (1971),  Gill 
and  Murray  (1978a),  Bunch  and  Kaufman  (1980),  and  Powell  (1981)  are  null-space 
methods,  and  are  more  efficient  when  the  number  of  constraints  in  the  working  set 
is  close  to  n,  since  the  dimension  of  the  null  space  is  then  relatively  small.  Range- 
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space  methods  increase  in  efficiency  as  the  number  of  constraints  in  the  working 
set  decreases.  For  an  example  of  a  range-space  method  see  Gill  et  ai.,  1982.  The 
differences  in  the  methods  arise  because  of  the  different  emphasis  placed  on  the 
relative  importance  of  storage  needs,  computational  efficiency  and  numerical  relia¬ 
bility.  When  solving  large-scale  problems,  these  considerations  favor  a  Lagrangian 
method,  since  a  projection  method  may  be  computationally  expensive  whenever  the 
number  of  active  constraints  is  neither  close  to  n  nor  to  zero. 

Several  examples  of  projection  methods  are  now  given  that  use  an  equivalent, 
but  simpler,  augmented  system.  Let  Si  and  S^  be  nonsingular  (nrii-f-m)x(nrR-l-m) 
matrices.  The  solution  of  (2.2)  is  equivalent  to  the  solution  of 

^‘(c  ""1  i-fl)  = (r)  ’ 

A  particular  range-space  method  follows  when  Si  is  defined  by 


and  5a  is  the  identity  matrix.  The  augmented  system  can  then  be  written  as 

From  (2.4),  the  following  equations  for  p  and  /i  are  obtained: 

CH-^C^H  =  CH-^g-r,  (2.5o) 

and 

Bp  =  C^n  -  g.  (2.56) 


In  order  to  solve  (2.5),  factorizations  of  H  and  CH~^C^  are  required. 
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The  remaicder  of  the  methods  presented  in  this  section  will  be  null-space  meth¬ 
ods.  One  choice  for  Si  is  based  on  finding  an  m  x  m  reverse  triangular  matrix  T 
and  an  rim  x  ^m  matrbc  Q  (see  Gill  et  ai.  1982)  satisfying 

CQ  =  (0  T).  (2.6) 

Assume  that  the  columns  of  Q  are  partitioned  so  that 


Q  =  iz  Y), 


where  Z  is  an  x  (n^R  —  m)  matrix  and  Y  is  an  n^R  x  m  matrix.  Define  S2  to 
be  the  matrix 

(^.7, 

and  define  Si  as  the  transpose  of  S^.  Let  px  and  py  denote  the  first  Urr  -  m  and 
last  m  elements  of  p,  respectively.  Using  (2.7)  in  (2.3)  yields 


Z'^HZ  Z'^HY 
YThz  Y'^HY  7’^' 
T 


Pz 

Pr  I  = 
-f*. 


r 


Thus,  p  and  n  may  be  found  by  successively  solving  the  equations 


and 


(2.8) 


1 

II 

>• 

(2.9a) 

Z^HZpx  =  -Z^g  -  Z^HYpr, 

(2.96) 

P=Ypy  +Zpx, 

(2.9c) 

=  Y^{g  Hp). 

(2.9J) 

In  order  to  solve  (2.9),  a  TQ  factorization  of  C  and  a  factorization  of  Z^HZ  (the 
projected  Hessian)  are  required. 
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When  H  is  positive  definite,  another  method  may  be  defined  using  a  different 


choice  for  the  matrix  Q  in  (2.6).  Suppose  that  Q  satisfies  (2.6),  and 

Q’^HQ  =  I.  (2.10) 

In  this  case,  we  have  Z'^HZ  =  I  and  Z'^HY  =  0.  The  equations  (2.9)  for  p  and  n, 
therefore,  simplify  to  become 

Tpr  =  -r, 

Pt  =  -Z'^g, 

p  =  Ypy  +Zpz, 

and 

T^fi  =  Y^g  +  Py.  (2.1  Id) 

Equations  (2.11)  are  used  to  solve  the  augmented  system  in  the  QP  method 
of  Goldfarb  and  Idnani  (1983).  Similar  techniques  have  been  suggested  for  both 
the  positive-definite  and  indefinite  cases  in  which  the  relationship  Z^HZ  =  Z?  is 
maintained  instead  of  (2.10).  The  matrix  D  may  be  diagonal  (see  Murray,  1971)  or 
block-diagonal  (see  Bunch  and  Kaufman,  1980). 

If  the  matrix  C  has  unit  column  vectora  (i.e.,  vectors  tj  with  the  j-th  component 
equal  to  one,  and  the  remaining  components  equal  to  zero),  this  special  structure 
can  be  used  to  solve  the  augmented  system  more  efficiently.  An  example  of  this 
special  structure  is  when  there  are  slack  variables  in  the  QP.  Suppose  C  has  an 
additional  n«  columns  that  correspond  to  free  slack  variables.  The  m  x  (rim  +  ris) 
matrix  C  can  be  written  as 

^-(a:  -/)■ 

where  the  identity  corresponds  to  the  free  slack  variables.  The  components  of  M 


(2.11a) 

(2.116) 

(2.11c) 
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and  then  set  A  =  0  and  q  =  Atp  +  r^.  The  smaller  augmented  system  (2.14)  can  be 
solved  by  any  one  of  the  methods  described  earlier. 

2.S.  Changes  to  the  working  set 

Two  strategies  that  control  changes  to  the  working  set  are  now  briefly  outlined. 
Let  X  denote  the  current  iterate  and  A  the  multipliers  of  the  constraints  in  the 
current  working  set.  The  strategies  described  here  always  attempt  to  move  from 
the  minimum  on  one  set  of  constraints  to  the  minimum  on  another  by  taking  steps 
of  the  form  p  and  /i  —  A.  The  maintenance  of  certain  conditions  on  the  working 
set  causes  a  shorter  step  ap  and  a{fi  —  A),  where  a  (0  <  o  <  1)  depends  upon  the 
active-set  strategy  being  used. 

The  first  strategy  discussed  is  representative  of  the  class  of  active-set  feasible- 
point  methods.  In  this  case,  i  is  feasible  (i.e.,  all  r,  >  0)  but  is  not  dual  feasible 
(i.e.,  there  exists  some  constraint  for  which  Ay  <  0).  Changes  in  the  working  set  are 
designed  to  maintain  feasibility  with  respect  to  the  constraints  and  to  move  interior 
to  constraints  that  have  negative  Lagrange  multipliers.  At  a  typical  iteration,  C 
comprises  the  constraints  that  are  satisfied  exactly  at  z  (i.e.,  r  =  0  in  (2.2)).  The 
step  length  a  is  equal  to  one  if  z  -I-  p  is  feasible  (i.e.,  ry  -f-  ojp  >  0).  Otherwise,  a  is 
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the  largest  step  such  that  ry  +  aajp  =  0  for  an  index  j  of  a  constraint  that  is  not 
in  the  working  set.  A  constraint  is  added  to  the  working  set  if  its  residual  becomes 
sero  at  the  step  a.  A  constraint  with  Ay  <  0  is  selected  to  be  deleted  from  the 
working  set  after  a  unit  step.  (The  active-set  method  outlined  briefly  in  Section  2.1 
is  an  active-set  feasible-point  method.) 

The  second  strategy  forms  the  basis  of  the  class  of  dual-feasible  active-set  meth¬ 
ods.  In  these  methods,  x  is  not  feasible  (i.e.,  some  ry  <  0)  but  is  always  dual  feasible 
(i.e.,  all  Ay  >  0).  Changes  in  the  working  set  are  designed  to  maintain  non-negative 
multipliers  while  moving  to  satisfy  the  violated  constraints.  The  step  a  is  equal  to 
one  if  I  -f  p  is  dual  feasible  (i.e.,  fij  >  0).  Otherwise,  a  is  the  largest  step  such  that 
Ay  -f  a(/iy  -  Ay)  =  0  for  a  constraint  index  j  that  is  in  the  working  set.  A  constraint 
is  deleted  from  the  working  set  if  its  associated  value  of  A  becomes  zero  at  the  step 
a.  After  a  unit  step  is  taken,  the  iterate  will  be  primal  feasible.  (If  the  constraints 
in  the  QP  were  in  an  equivalent  form  where  all  of  the  general  constraints  were  not 
necessarily  in  the  working  set,  then  after  a  unit  step  is  taken,  the  iterate  would 
satisfy  the  working  set.  Then,  a  constraint  with  ry  <  0  would  be  selected  to  be 
added  to  the  working  set.) 

For  both  of  these  active-set  strategies,  each  change  in  the  working  set  leads  to 
a  simple  change  to  C,  which  in  turn  leads  to  a  change  in  the  factorizations  being 
used  to  solve  (2.2).  Since  the  only  inequalities  in  QPl  are  bound  constraints,  only 
bounds  can  be  added  to  or  deleted  from  the  working  set.  These  changes  to  the 
working  set  lead  to  changes  in  the  columns  of  C.  (When  a  bound  corr^ponding  to 
a  variable  is  deleted  from  or  added  to  the  working  set,  we  may  shorten  the  wording 
and  say  that  the  variable  has  been  deleted  from  or  added  to  the  working  set.)  If  the 
only  inequalities  of  a  QP  were  general  constraints,  only  general  constraints  could 
be  added  to  or  deleted  from  the  working  set.  These  changes  to  the  working  set 
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would  lead  to  changes  in  the  rows  of  the  constraint  matrix  used  in  the  augmented 
system.  If  a  QP  had  both  bounds  and  general  constraints  as  inequalities,  both 
bounds  and  general  constraints  could  be  added  to  or  deleted  from  the  working 
set.  These  changes  to  the  working  set  would  lead  to  changes  in  both  the  rows  and 
columns  of  the  constraint  matrix  used  in  the  augmented  system. 

2.4.  Initialisation  procedures 

Both  of  the  active-set  strategies  described  in  the  previous  section  require  an  initial¬ 
ization  procedure  to  obtain  an  initial  primal-  or  dual-feasible  point.  It  is  critical 
that  this  procedure  be  able  to  utilize  a  pre- assigned  working  set.  In  this  way,  during 
later  major  iterations  of  an  SQP  method,  it  is  possible  to  force  the  QP  subproblems 
to  reach  optimality  in  a  single  iteration  because  the  optimal  active  set  is  available 
before  solving  the  subproblem. 

The  initialization  procedure  for  the  primal  feasible-point  method  is  equivalent 
to  a  linear  programming  problem.  Consider  the  sum  of  infeasibilities  v(p)  =  E|r,  |. 
Note  that  v(p)  is  a  linear  function  of  z  which  is  zero  at  any  feasible  point,  and  pos¬ 
itive  at  an  infeasible  point.  Therefore,  a  feasible  point  can  be  found  by  minimizing 
t'(p)- 

If  a  dual-feasible  active-set  strategy  is  used,  the  following  initialization  proce¬ 
dure  may  be  employed.  The  procedure  is  based  on  finding  a  subset  of  the  pre¬ 
assigned  working  set  on  which  the  multipliers  are  positive.  First,  the  minimum 
of  the  quadratic  function  on  the  pre-assigned  working  set  is  computed  by  solving 
(2.2).  If  the  fij  are  non-negative,  the  initial  point  is  given  by  x  and  X  =  p  and 
the  initialization  is  complete.  Otherwise,  a  constraint  with  a  negative  multiplier  is 
deleted,  the  factorizations  are  updated  and  (2.2)  is  solved  again.  This  process  is 
repeated  until  (t)  all  the  multipliers  are  positive,  in  which  case  z  and  X  =  p  define 
the  required  initial  point,  or  (it)  the  working  set  is  empty,  in  which  case  z  4-  p  is 
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the  unconstrained  minimum  and  is  trivially  dual  feasible. 

An  alternative  initialization  procedure  for  the  dual-feasible  active-set  method 
is  to  always  start  at  the  unconstrained  minimum  (if  one  exists)  and  give  preference 
to  adding  the  pre-assigned  constraints.  However,  if  the  pre-assigned  working  set 
is  similar  to  the  active  set,  this  scheme  is  likely  to  require  more  work  than  the 
procedure  above.  First,  more  operations  are  required  to  compute  the  factorizations 
by  updating.  Second,  even  if  the  pre-assigned  working  set  defines  the  optimal 
feasible  point,  the  number  of  QP  iterations  may  not  be  equal  to  the  dimension 
of  the  optimal  working  set  since  it  cannot  be  guaranteed  that  the  multipliers  will 
remain  dual-feasible  during  the  intermediate  iterations. 

2.S.  Alternative  QP  metbods 

The  single-phase  method  of  this  thesis  is  similar  to  other  QP  methods.  Brief  descrip¬ 
tions  of  several  of  these  methods  are  now  given,  in  order  to  facilitate  a  comparison 
of  them  to  the  single-phase  method.  Their  descriptions  will  include  the  method 
used  to  solve  the  EQP,  the  strategy  used  to  control  changes  in  the  working  set,  and 
the  initialization  procedure. 

2.6.1.  QPSOL:  a  primal-feasible  algorithm.  QPSOL  is  an  active-set  feasible- 
point  QP  algorithm  of  Gill,  Murray,  Saunders,  and  Wright.  In  QPSOL  the  Hessian 
is  allowed  to  be  indefinite  (see  Gill  e(  ai.,  1983a).  In  terms  of  solving  the  augmented 
system,  QPSOL  is  a  null-space  projection  method  that  uses  equation  (2.9).  Since  it 
is  a  feasible- point  method,  the  r  and  hence  py  terms  in  (2.9)  are  zero.  The  strategy 
used  by  QPSOL  that  controls  changes  in  the  working  set  is  described  in  Section 
2.3.  QPSOL  maintains  primal  feasibility  while  moving  to  a  constrained  stationary 
point.  Then,  a  constraint  with  a  negative  multiplier  is  deleted,  and  the  algorithm 
finds  a  different  constrained  stationary  point  with  a  lower  objective  value. 
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The  algorithm  is  also  a  two-phase  method  since  it  first  finds  a  feasible  point. 
In  order  to  perform  this  initialization  procedure,  QPSOL  minimizes  the  sum  of  in¬ 
feasibilities,  v(p),  that  was  described  in  Section  2.4.  It  minimizes  the  function  v(p) 
using  an  active-set  strategy  that  is  almost  identical  to  the  feasible-point  active- 
set  method.  The  principal  differences  are  that  the  search  direction  is  defined  as 
—ZZ^Vv(p)  and  the  largest  value  of  a  is  computed,  subject  to  the  restriction  that 
the  number  of  violated  constraints  is  not  increased.  This  gives  a  as  the  step  to  the 
nearest  of  the  satisfied  constraints.  Several  violated  constraints  may  become  satis¬ 
fied  during  a  single  iteration.  Notice  that  the  feasibility  phase  does  not  perform  the 
standard  simplex  method  (i.e.,  it  does  not  necessarily  find  a  vertex).  The  implemen¬ 
tation  of  this  procedure  reflects  the  similarity  of  the  linear  algebraic  computations 
associated  with  iterations  in  both  the  feasibility  and  QP  phases — in  particular,  each 
iteration  involves  an  update  of  the  same  factorization  of  the  working  set.  The  com¬ 
putations  in  both  phases  may  be  performed  by  exactly  the  same  program  modules. 
The  two-phase  nature  of  the  ilgc'-ithm  is  reflected  by  changing  the  function  being 
minimized  from  the  sum  ^  f  in  feasibilities  to  the  quadratic  objective  function.  The 
important  feature  of  tb  ?  type  of  implementation  is  that  if  the  pre-assigned  working 
set  is  similar  to  the  actUe  set,  just  a  few  changes  in  the  working  set  are  necessary  to 
achieve  feasibility  fn  particular,  if  the  initial  point  is  feasbile,  the  procedure  merely 
computes  all  uie  relevant  factorizations  (which  are  also  needed  for  the  QP  phase) 
and  performs  a  feasibility  check. 

^  oir  plications  arise  in  wde£njte  quadratic  programming  problems,  in  which 
th  j  matrix  Z'^HZ  is  indefinite  for  some  Z.  In  this  case,  it  is  not  true  that  any 
constrained  stationary  point  is  a  local  minimum  in  the  current  null  space.  Further¬ 
more,  the  direction  defined  by  (2.9d)  is  not  necessarily  a  direction  of  descent  for  the 
quadratic  function. 
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If  the  iterate  is  a  constrained  stationary  point  with  an  indefinite  projected 
Hessian,  there  must  exist  a  vector  pz  (a  direction  of  negative  curvature)  such 
that  p^Z^HZpz  <  0;  hence,  if  the  problem  contained  no  further  constraints,  the 
quadratic  objective  would  be  unbounded  below  in  the  current  manifold.  However, 
assuming  that  the  QP  has  a  finite  solution,  a  move  along  such  a  direction  must 
encounter  a  constraint  that  is  not  already  in  the  working  set.  Eventually,  enough 
constraints  must  be  added  to  the  working  set  so  that  the  projected  Hessian  becomes 
positive  definite. 

These  observations  lead  to  the  manner  in  which  QPSOL  treats  indefiniteness 
in  the  projected  Hessian.  The  method  finds  a  positive-definite  projected  Hessian. 
When  the  projected  Hessian  is  positive  definite,  it  may  become  indefinite  only  when 
a  constraint  is  deleted  from  the  working  set.  If  this  occurs,  QPSOL  calculates  a 
direction  of  negative  curvature  for  the  search  direction  and  does  not  delete  any 
further  constraints  until  enough  constraints  are  added  to  make  the  projected  Hessian 
positive  definite. 

2.5.2.  Dual-feasible  method  of  Goldfarb  aud  Iduani.  The  dual-feasible 
active-set  QP  method  (Goldfarb  and  Idnani,  1983),  referred  to  in  this  thesis  as  GI, 
requires  a  positive-definite  Hessian.  In  terms  of  solving  the  augmented  system,  GI 
is  a  range-space  projection  method  that  uses  equation  (2.11).  Equation  (2.11)  is 
simplified  in  GI,  since  the  r  term  has  exactly  one  nonzero  component  and  the  Zpz 
term  is  always  zero.  The  strategy  used  by  GI  that  controls  changes  in  the  working 
set  is  described  in  Section  2.3.  GI  maintains  dual  feasibility  while  reducing  r  to  the 
zero  vector.  When  r  becomes  the  zero  vector,  the  current  iterate  is  optimal  and 
feasible  with  respect  to  the  current  working  set.  At  this  point  in  the  algorithm,  a 
violated  constraint  is  added  to  the  working  set,  making  the  new  r  nonzero.  The 
algorithm  continues  in  this  way,  increasing  the  value  of  the  primal  objective  function, 
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until  primal  feasibility  is  achieved.  GI  is  a  single-phase  method,  since  the  origin  in 
the  space  of  dual  variables  is  dual  feasible.  See  Goldfarb  and  Idnani,  1983,  Section 
8,  for  comments  on  how  GI  can  be  extended  to  handle  an  indefinite  Hessian. 

GI  is  a  dual  method,  but  it  is  equivalent  to  applying  an  appropriate  primal 
method  to  the  dual  QP.  It  is  presented  in  Goldfarb  and  Idnani,  1983  and  in  this 
thesis  as  a  dual  method  because,  as  Goldfarb  and  Idnani  point  out,  “we  believe  this 
to  be  more  instructive".  The  presentation  of  G/  as  a  dual  method  will  also  make 
clearer  the  comparison  of  GI  to  the  method  presented  in  this  thesis. 

2.5.3.  Exact-penalty  method  of  Conn  and  Sinclair.  Another  method  for 
solving  QP’s  (and  other  constrained  programming  problems)  is  to  solve  a  related, 
unconstrained  problem  which  has  as  its  objective  function  the  sum  of  F(x)  (the 
objective  function  of  the  QP),  and  a  penalty  term  (see  Fletcher,  1981).  This  term 
imposes  a  positive  penalty  to  an  iterate  which  violates  any  of  the  constraints  of 
the  QP.  A  commonly-used  penalty  function  is  the  absolute  value  penalty  function, 
given  by 

F(z,p)  =  F(i)  -  teV,  (2.15) 

t 

where  r,  is  the  residual  of  the  t-th  constraint  and  V  denotes  the  set  of  violated 
constraints.  Let  £  denote  a  solution  of  the  QP.  There  is  a  threshold  value  such 
that  X  is  an  unconstrained  minimum  of  (2.15)  for  any  p  >  p.  For  this  reason, 
penalty  functions  like  P{x,  p)  are  sometimes  termed  exact  penalty  functions.  (In 
contrast,  there  are  differentiable  penalty  functions  for  which  the  solution  to  the 
unconstrained  problem  with  the  penalty  term  is  not  equal  to  the  solution  of  the 
constrained  problem  for  any  finite  p.  For  a  discussion  of  these  penalty  functions 
and  other  exact  penalty  functions,  see  Gill  et  al.,  1981  and  Fletcher,  1981.)  An  EQP 
has  a  natural  step  length  of  unity.  However,  a  step  of  unity  may  have  no  meaning 
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when  using  (2.15),  because  the  gradient  changes  as  violated  constraints  become 
satisfied.  Possible  strategies  to  determine  a  step  length  when  using  (2.15)  include 
a  line  search  to  find  the  minimum  along  the  search  direction  and  determining  the 
first  constraint  hit  along  the  search  direction. 

The  QP  method  by  Conn  and  Sinclair,  1975  (referred  to  in  this  thesis  as  CS) 
uses  the  exact  penalty  function  (2.15).  The  search  direction  and  multipliers  are 
determined  by  solving  (2.9).  Even  though  CS  solves  an  unconstrained  problem, 
there  is  an  underlying  working  set  which  determines  the  penalty  term  and  the  null 
space,  represented  by  the  matrix  Z.  The  strategy  to  control  changes  to  the  working 
set  is  similar  to  the  active-set  feasible- point  strategy.  It  is  similar  since  the  constraint 
with  the  most  negative  multiplier  is  deleted  from  the  working  set.  However,  the 
line  search  (which  determines  the  minimum  along  the  search  direction)  determines 
which  constraints  to  add  to  the  working  set.  If  the  new  iterate  satisfies  an  additional 
constraint,  it  is  added  to  the  working  set.  Note  that  the  new  iterate  may  violate  one 
or  more  previously  satisfied  constraints.  Since  CS  is  not  a  primal-  or  dual-feasible 
method,  an  initialization  phase  to  find  a  feasible  point  is  not  required.  Instead,  CS 
must  initialize  the  value  of  p,  and  abo  determine  an  upper  bound  on  p,  at  which  the 
QP  problem  is  considered  infeasible,  if  the  solution  of  (2.15)  has  a  positive  penalty. 


Chapter  Three 
The  Single^Phase  Method 


3.1.  Ootlme  of  the  single-phase  method 

The  single-phase  method  of  this  thesis  is  an  active-set  method.  In  order  to  solve  the 
augmented  system,  the  single-phase  method  can  use  the  Lagrangian  method,  or,  if 
the  problem  is  not  large,  a  projection  method.  In  this  chapter,  the  single-phase 
method  uses  the  Lagrangian  method  so  that  it  is  applicable  to  large-scale  problems. 

The  form  of  the  quadratic  programming  problem  solved  is  QPl.  Since  the  only 
inequalities  in  QPl  are  bound  constraints,  only  bounds  enter  or  leave  the  working 
set.  These  changes  to  the  working  set  lead  to  column  changes  in  C. 

The  main  steps  of  the  single-phase  method  are: 

1.  Initialiie.  Start  with  any  z  >  0. 

2.  Test  for  convergence. 

3.  Find  p  and  /x.  Using  a  Lagrangian  method,  solve 

«(_')=-(’).  *=(c  ^)- 

4.  Either  delete  a  bonnd  or  calculate  the  step  length. 

a.  If  a  bound  is  deleted,  update  K  and  return  to  step  2. 

b.  If  the  step  length  is  calculated,  continue  with  step  5. 

5.  Update,  z  ^  z  -t-  ap,  r  ^  (1  -  a)r,  and  ^  (1  -  a)g  + 

If  a  <  1,  add  a  bound  and  update  K. 

Return  to  step  2. 
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The  iterates  of  this  single-phase  method  may  be  infeasible  in  the  sense  that  the 
general  constraints  may  be  violated.  However,  the  bound  constraints  will  always 
be  satisfied.  Thus,  the  bound  constraints  in  the  working  set  are  satisfied,  but 
the  general  constraints  in  the  working  set  may  be  satisfied  or  violated.  However, 
the  violated  constraints  are  satisfied  at  x  -H  p  (i.e.,  at  a  unit  step).  (Note  that  if 
the  current  iterate  is  not  feasible,  the  search  direction  is  not  necessarily  a  descent 
direction.)  With  each  nonzero  step  taken,  the  magnitude  of  the  residual  of  each 
violated  constraint  decreases,  so  that  eventually  a  feasible  point  is  found  (when  one 
exists).  Once  a  feasible  point  is  found,  the  method  is  equivalent  to  an  active-set 
feasible-point  method. 

In  order  to  solve  the  augmented  system,  the  matrbc  K  must  be  nonsingular.  In 
an  active-set  feasible-point  method  (see  Section  2.3),  K  can  become  singular  only  if 
the  projected  Hessian  becomes  singular.  However,  in  the  single-phase  method  of  this 
thesis,  K  can  become  singular  if  either  the  projected  Hessian  becomes  singular  or  if 
the  constraints  become  dependent.  A  general  strategy  that  maintains  nonsingularity 
in  K  aa  constraints  are  added  and  deleted  is  presented  in  Section  3.2.  The  single¬ 
phase  Lagrangian  method  will  be  described  in  more  detail  in  Section  3.3. 
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S.2.  Background:  singnlarity  in  the  angmented  aystem 

If  a  change  in  the  working  set  results  in  a  singular  K,  a  strategy  must  be  used 
that  will  restore  nonsingularity.  If  a  bound  is  hit  (added  to  the  working  set)  the 
dimension  of  the  projected  Hessian  decreases  by  one.  If  the  projected  Hessian  is 
positive  definite,  the  new  projected  Hessian  is  necessarily  positive  definite.  If  the 
projected  Hessian  is  not  positive  definite,  the  new  projected  Hessian  may  be  singular. 
Thus,  if  a  bound  is  hit  and  the  new  K  is  singular,  either  the  projected  Hessian  is 
singular  or  the  constraints  are  dependent.  In  this  event  the  Singularity  Rule  (SR) 
will  be  used  to  determine  a  K  that  is  nonsingular.  (The  Singularity  Rule  will  be 
described  in  the  next  subsection.)  If  a  variable  is  released  from  its  bound  (deleted 
from  the  working  set)  and  the  new  K  is  singular,  the  projected  Hessian  is  singular. 
If  this  occurs,  a  strategy  will  be  employed  that  makes  a  slight  perturbation  to  the 
Hessian  to  determine  a  K  that  is  nonsingular.  These  two  strategies  are  described 
in  the  next  two  subsections. 

S.2.1.  Adding  a  bound.  When  a  variable  is  added  to  the  working  set,  thereby 
fixing  the  variable  on  its  bound,  either  the  projected  Hessian  is  singular  or  the 
constraints  are  linearly  dependent. 

The  search  direction  is  calculated  by  solving  the  augmented  system  of  equations 

'(-;)-(:)■  »» 

where  K  is  nonsingular.  Suppose  that  the  step  a  along  the  search  direction  causes 
the  k-ih.  component  of  Zm  to  hit  its  bound.  When  the  k-ih  variable  is  added  to 
the  working  set,  the  matrix  of  the  new  augmented  system  can  be  represented  by 
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The  method  cannot  continue  if  M  is  singular.  Note,  however,  that  if  M  is 
singular,  it  can  have  only  one  sero  eigenvalue.  Consequently,  it  is  possible  to  main¬ 
tain  nonsingularity  in  the  matrix  of  the  augmented  system  by  freeing  one  of  the 
currently  fixed  variables  (other  than  the  one  which  was  just  fixed). 

Therefore,  when  a  bound  is  hit,  it  must  be  determined  whether  or  not  Af  is 
singular.  One  method  to  identify  singularity  in  M  is  to  form  its  factorization. 
Another  method  is  to  use  the  following  Lemma. 

Lemma  1.  Let  y  and  z  solve 

(?")(:)=(o‘)- 

The  matrix  M  is  singular  if  and  only  if  cjz  =  0. 

Proof: 

=>)  If  M  is  singular,  there  exist  vectors  5  and  f  that  satisfy 


However,  z  and  y  are  the  unique  solutions  to  (3.3),  so  S  =  —z,  JF  =  -y,  and  e^z  =  0. 

<=)  If  e^z  is  zero,  z  and  y  solve  (3.4).  Existence  of  a  solution  to  (3.4)  implies 
that  M  is  singular.  | 

A  different  proof  for  Lemma  1  explicitly  uses  the  inertia  of  M  and  K.  The 
inertia  of  a  real  symmetric  matrix  M  is  the  triple 

In(Af)  =  (t,i/,5), 

where  w,  v,  and  S  denote  respectively,  the  number  of  positive,  negative,  and  sero 
eigenvalues  of  M.  A  well  known  expression  for  the  inertia  is 


In(M)  =  ln{K)  +  In(A//if), 


(3.5) 
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where  {M/K)  is  the  Schur  complement  of  K  in  M  (see  Haynsworth,  1968).  Using 
this  formula,  if  K  is  nonsingular,  M  is  singular  if  and  only  if  (M/K)  equals  zero. 
Using  the  definition  of  the  Schur  complement  (see  Cottle,  1974)  gives 

(«/«•)  =  0 )*-■(«?  Of:=-elz.  (3.6) 


Thus,  the  singularity  of  M  may  be  ascertained  by  solving  (3.3),  and  calculating 
e'^z.  If  Af  is  singular,  a  variable  must  be  freed  that  ensures  (i)  the  matrix  in  the 
new  augmented  system  is  nonsingular,  and  (it)  the  new  search  direction  is  a  feasible 
direction  with  respect  to  the  new  free  variable.  (The  vector  p  is  a  feasible  direction 
with  respect  to  constraint  j,  if  any  positive  step  along  p  moves  to  a  point  which 
satisfies  constraint  j.)  The  SR  will  identify  a  variable  to  free,  such  that  these  two 
conditions  will  be  met.  Let  the  vector  a  denote  the  column  of  X  corresponding  to  the 
new  free  variable.  Let  the  vector  h  denote  the  row  and  column  of  corresponding 
to  the  new  free  variable.  Let  the  constant  hjj  denote  the  diagonal  element  of  )( 
corresponding  to  the  new  free  variable.  The  new  H  is  given  by 


H  h 


At  a  step  of  a  along  p,  the  new  residual  and  new  gradient,  denoted  by  f  and  g 
respectively,  can  be  expressed  as 


and 


r=  (1  -a)r, 


(3.7a) 


5  =  (1  -  q),  +  aC^it. 


(3.76) 


The  new  augmented  system,  formed  by  adding  the  A-th  variable  to  the  working 
set  and  deleting  the  variable  corresponding  to  column  a  from  the  working  set,  can 


1  1 
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then  be  written  as 


(H 

h  ek\ 

( 

(^\ 

C 

-A  1  _ 

a^  hjj  1 

7  1 

\el 

J 

\-$l 

lo/ 

(3.8) 


The  new  search  direction  is  The  new  free  variable  is  at  its  lower  bound 

of  zero,  so  if  /?  is  nonnegative,  the  new  search  direction  is  a  feasible  direction  with 
respect  to  the  new  free  variable.  Thus,  the  SR  has  found  a  suitable  variable  to  free 
if  (t)  the  matrix  in  (3.8)  is  nonsingular,  and  (it)  fi  is  nonnegative. 

A  simple  means  to  determine  p  when  the  matrix  in  (3.8)  is  nonsingular  will 
now  be  given.  The  explanation  is  simplified  by  constructing  an  equivalent,  but 
simpler  system,  using  the  transformation  expressed  by  equation  (2.3).  Let  Si  be 
the  identity  matrix  except  that  the  bottom  row  is  given  by 


-(z^  »’■  0  -1), 


(3.9) 


where  z  and  y  solve  (3.3).  Let  S^  be  the  transpose  of  Si.  Define  the  constant  k 
such  that 

k  =  a^y  +  h'^z.  (3.10) 

By  using  Lemma  1,  the  transformed  augmented  system,  constructed  from  (3.8),  can 
be  written  as 


(3.11) 


\ 

( 

\-0) 

=  -Si 

f 

1 

\oJ 

where 


S2 


P\ 

( 

-A 

P 

\-9) 

V-eJ 

(3.12) 
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The  matrix  in  (3.11)  is  nonsingular  if  and  only  if  k  is  not  equal  to  sero.  (By 
expanding  along  the  last  column  and  then  the  last  row,  the  determinant  is  equal  to 
±A:^det(A’).)  If  A:  is  not  equal  to  sero,  (3.11)  can  be  solved  for  P  giving 


-k^  =  . 


(3.13) 


Rewriting  (3.13)  gives 


/?  =  - 


-I- 

aTy  +  h'^z ' 


(3.14) 


Thus,  the  SR  consists  of  searching  for  a  variable  to  free  by  calculating  k  and  P 
for  each  fixed  variable.  If  A:  is  nonzero  and  p  is  nonnegative  for  any  fixed  variable, 
the  variable  is  a  suitable  variable  to  free. 


3.2.2.  Deleting  a  bonnd.  Consider  singularity  in  K  when  a  variable  is  deleted 
from  the  working  set,  thereby  freeing  the  variable  from  its  bound.  In  this  case,  the 
projected  Hessian  must  be  singular. 

The  search  direction  is  calculated  by  solving  (3.1).  Suppose  that  the  A;-th 
variable  is  deleted  from  the  working  set.  The  matrix  of  the  new  augmented  system 


is  then  given  by 


( H  C’^  h 
M=  I  C  a 

\h^  hkk 


(3.15) 


where  a  denotes  the  fc-th  column  of  A,  h  denotes  the  Ifc-th  row  and  column  of  )l, 
excluding  the  A;-th  element,  and  hkic  denotes  the  Ac-th  diagonal  element  of  U.  The 
method  cannot  continue  when  M  is  singular.  Note,  however,  that  if  Af  is  singular,  it 
can  have  only  one  zero  eigenvalue.  Consequently,  it  is  possible  to  maintain  nonsin¬ 
gularity  in  the  matrix  of  the  augmented  system  by  modifying  hkk  by  a  sufficiently 
small  positive  amount,  c. 
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Therefore,  when  a  bound  is  hit,  it  must  be  determined  whether  or  not  M  is 
singular.  One  method  to  identify  singularity  in  M  is  to  form  its  factorization. 
Another  method  is  to  solve  for  the  Schur  complement  of  K  in  M,  {M/K).  In  this 
case, 

(M/K)  =  *»»-{/>’■  <■’■)’■.  (3.16) 

M  is  singular  if  and  only  if  [M/K)  equals  zero.  Thus,  the  singularity  of  M  may  be 
ascertained  by  calculating  {M/K). 

Note  that  if  the  search  direction  is  nonzero,  a  variable  need  not  be  deleted. 
This  implies  that  if  a  variable  is  deleted  when  the  search  direction  is  nonzero  and 
the  new  augmented  system  is  singular,  we  can  avoid  singularity  difficulties  by  not 
deleting  the  variable.  If  the  search  direction  is  the  zero  vector,  the  current  iterate 
is  a  feasible,  constrained  stationary  point.  In  this  event,  a  variable  must  be  deleted 
in  order  to  continue.  If  singularity  results,  a  strategy  must  be  employed  that  will 
restore  nonsingularity,  and  determine  a  descent  direction.  The  following  theorem 
shows  how  this  can  be  accomplished. 

Theorem  1.  Assume  that  the  current  iterate  is  a  feasible,  constrained  stationary 
point.  Let  the  k-th  variable  be  on  its  bound  and  0^  denote  the  corresponding 
multiplier.  Suppose  that  when  the  k-th  bound  is  deleted  from  the  working  set, 
M  (3.15)  is  singular.  Form  M  by  adding  a  small  positive  quantity  c  to  hkk  (i-e., 
hkk  =  hkk  +  If  the  augmented  system  is  solved  using  M,  the  search  direction  is 
a  descent  direction. 

Proof: 

The  search  direction  p,  is  the  zero  vector,  since  the  current  iterate  is  a  feasible, 
constrained  stationary  point.  The  multiplier,  satisfies  the  equation 
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where  gk  is  the  A-th  component  of  the  gradient,  a  is  the  ib-th  colunm  of  A,  and 


(P,m)  solves  (3.1),  with  p  =  0. 

Since  M  is  singular,  {M/K)  equals  sero.  By  modifying  we  force  (M/K) 
to  have  the  value  e.  This  implies  that  JQT  is  nonsingular.  Using  the  augmented 


system  is  given  by 


(3.18) 


where  pk  ^d  gk  are,  respectively,  the  k-ih  components  of  the  new  search  direction, 
p,  and  the  new  gradient,  §.  If  pk  equals  sero,  then  fl  equals  p,  and  p  is  the  eero 
vector,  since  they  must  satisfy  the  original  augmented  system.  This  would  imply 


a^H  =  gk. 


(3.19) 


Equations  (3.17)  and  (3.19)  together  imply  that  Xk  equals  sero,  which  contradicts 
the  hypothesis.  Thus,  pk  must  be  nonsero. 

The  projected  gradient  can  be  expressed  by 


=  PfcAfc. 


(3.20) 


By  solving  (3.18)  and  then  reversing  the  sign  of  p,  if  pk  is  negative,  equation 
(3.20)  implies  that  p'^§  is  negative.  | 

A  small  value  for  e  implies  that  the  matrix  in  (3.18)  is  ill  conditioned.  The 
search  direction  can  be  calculated  independent  of  c  (see  Section  3.3.4). 

A  different  scheme  could  be  employed  when  deleting  a  bound  results  in  a  sin¬ 
gular  M.  The  scheme  is  to  delete  a  different  bound — one  that  does  not  result  in  a 
singular  Af.  However,  it  is  certainly  possible  that  a  singular  Af  would  result  if  any 
one  of  the  bounds  in  the  working  set  were  deleted.  For  this  reason,  this  scheme  is 
not  to  be  recommended. 
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In  summary,  if  the  current  iterate  is  a  feasible,  constrained  stationary  r^mt 
and  the  augmented  matrix  is  singular  after  a  variable  with  a  negative  '-uuUiplier 
is  deleted,  the  strategy  of  making  a  slight  perturbation  to  H  will  \e  used.  This 
strategy  is  appropriate  for  use  when  the  augmented  system  is  solv<  using  the 
Lagrangian  method.  A  similar,  but  slightly  different,  pertf.rbation  t''  H  will  be  used 
in  the  implementation  of  the  single-phase  method  described  ii.  the  next  chapter. 

S.S.  Single-phase  Lagrangian  method 

Important  details  of  how  the  single-phase  method  may  be  implemented  using  the 
Lagrangian  method  to  solve  the  EQP  will  now  be  described.  First,  the  problem  of 
singularity  in  K  will  be  resolved.  Second,  a  means  of  detecting  that  the  problem 
has  no  feasible  point  will  be  explained.  Next,  convergence  of  the  method  to  an 
optimal  point  will  be  proven.  An  efficient  means  of  solving  successive  large-scale 
augmented  systems  follows. 

S.S.l.  Resolntion  of  singularity.  Singularity  in  K  can  occur  when  a  variable 
is  either  deleted  from,  or  added  to,  the  working  set.  In  order  to  detect  singularity, 
either  of  the  two  alternative  strategies  outlined  in  the  previous  section  may  be 
used.  Following  is  a  brief  description  of  a  computational  feature  of  each  of  these 
strategies.  The  first  strategy  calculates  a  factorization  of  M.  If  Mis  singular,  either 
the  SR  is  used  to  search  for  a  suitable  variable  to  free  from  its  bound,  or  a  slight 
perturbation  to  R  is  made.  Note  that  if  M  is  singular  when  a  variable  is  added 
to  the  working  set,  a  factorization  of  K  must  be  restored.  Thus,  the  work  to  find 
a  factorization  of  Af  and  then  restore  a  factorization  of  K  has  only  provided  the 
information  that  M  is  singular.  The  second  strategy  consists  of  finding  (M/K).  If 
{MfK)  is  not  zero,  Af  is  nonsingular.  In  this  event,  Af  is  known  to  be  nonsingular 
before  its  factorization  is  formed,  but  the  work  to  find  {M/K)  has  only  provided 
the  information  that  Af  is  nonsingular.  If  {M/K)  is  sero,  Af  is  singular,  and  either 
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the  SR  is  used  to  search  for  a  suitable  variable  to  free  or  a  slight  perturbation  to 
H  is  made.  Both  of  these  strategies  of  maintaining  nonsingularity  in  K  require  a 


significant  amount  of  computation  just  to  determine  if  Af  is  singular. 

If  singularity  occurs  when  a  variable  is  deleted  from  the  working  set,  the  strat¬ 
egy  of  modifying  the  Hessian  discussed  in  the  previous  section,  may  be  used  in  a 
straightforward  manner. 

If  the  addition  of  the  k-th  variable  to  the  working  set  causes  singularity,  the  SR 
will  be  used.  If  the  SR  finds  a  variable  to  free  from  its  bound,  a  search  direction 
can  be  computed  and  the  method  can  continue.  If  the  SR  cannot  find  a  suitable 
variable,  the  problem  might  not  have  a  feasible  point.  In  order  to  determine  if  the 
problem  does  not  have  a  feasible  point,  the  k-th  variable  is  not  added  to  the  working 
set.  Instead,  artificial  bounds  are  added  one  at  a  time,  keeping  C  independent. 
(The  artificial  bound  corresponds  to  the  current  value  of  the  variable.)  After  each 
artificial  bound  is  added,  an  attempt  is  made  to  add  the  k-th  variable  to  the  working 
set.  If  Af  is  singular,  the  SR  is  used  again.  If  it  does  not  find  a  suitable  variable 
to  free,  another  artificial  bound  is  added.  As  these  artificial  bounds  are  added, 
either  (i)  the  k-th  variable  will  be  successfully  added  to  the  working  set,  or  (ti)  the 
projected  Hessian  will  become  positive  definite.  Eventually,  the  projected  Hessian 
must  become  positive  definite.  If  the  projected  Hessian  is  positive  definite,  Af  is 
singular  when  the  k-th  variable  is  added,  and  the  SR  still  fails  to  find  a  suitable 
variable  to  delete,  the  problem  does  not  have  a  feasible  point  (see  Section  3.3.2). 

8.S.2.  Detection  of  an  infeasible  problem.  The  inability  of  the  SR  to  find  a 
suitable  variable  to  delete  does  not  necessarily  imply  that  there  is  no  feasible  point. 
However,  if  the  projected  Hessian  is  positive  definite  and  hitting  a  bound  results  in 
a  singular  Af,  then  it  will  be  shown  that  the  new  C  must  be  rank  deficient.  In  this 
case,  if  the  SR  does  not  find  a  suitable  constraint,  the  problem  is  infeasible.  Before 
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proving  this,  some  preliminary  Lemmas  are  required. 

In  Lemmas  2,  3,  4,  and  5,  and  Theorem  2,  the  following  assumptions  are  made: 
(0  the  matrices  F  and  C  from  (3.1)  have  been  used  to  calculate  where  Z 

satisfies  CZ  =  0,  and  (ti)  z  and  y  solve  (3.3). 

Lemma  2.  If  M  (3.2)  is  singulzr  and  Z'^HZ  is  positive  deSnite,  then  z  is  zero. 
Proof:  Write  z  in  terms  of  its  range-space  and  null-space  components  as 

^  =  ^(z  +  y(y. 

Equation  (3.3)  gives 

0  =  CZ=  CZ^g  +  CY(y  =  CYCy. 

CY  is  nonsingular  so  fy  is  zero,  and  z  can  be  written  as  s  =  Zf*.  Using  (3.3)  again 
gives 

C^y+Fs  =  efc.  (3.2i) 

Premultiply  (3.21)  by  giving 

SlZ^HZ(.  =  (3.22) 

CZ  is  zero,  and  Lemma  1  implies  that  z’efc  is  zero,  so  (3.22)  simplifies  to 

fJ(Z^FZ)f^  =  0.  (3.23) 

Since  Z^F Z  is  positive  definite,  is  zero,  which  implies  that  z  is  zero.  | 
Lemma  2.  If  M  (3.2)  is  singular  and  Z^HZ  is  positive  deSnite,  then  e^Z  =  0. 
Proof:  Lemma  2  and  (3.3)  imply  that 

=  efc.  (3.24) 

Premultiplying  (3.24)  by  Z^  gives 

Z^C’’y  =  Z^ek.  (3.25) 

Since  C Z  equab  zero,  (3.25)  implies  that  eJZ  is  zero.  | 
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Lemma  4.  If  e^Z  =  0,  then  M  (3.2)  is  singular. 

E^quation  (3.3)  implies  that  Cz  is  zero.  This  implies  that  z  can  be  written  as 
z  =  Z<^z-  Premultiplying  this  by  cj"  implies  e^z  is  zero,  since  e^Z  is  zero.  By  Lemma 
1,  M  is  singular,  since  cj'z  equals  zero.  | 

As  the  next  lemma  demonstrates,  singularity  in  Af  can  be  attributed  to  either 
the  projected  Hessian  or  the  constraints.  Let  Ck  denote  the  matrix  C  with  the  k-th 
column,  Ck,  deleted. 

Lemma  5.  C*  is  ranJc  deScient  if  and  only  if  2  =  0. 

Proof: 

=>)  Rank  deficiency  of  Ck  implies  the  existence  of  a  nonzero  vector,  y,  such  that 
Cjy  is  zero.  The  matrix  C  having  full  rank  implies  that  c\y  is  nonzero.  Together, 
these  facts  imply  the  existence  of  a  vector  y  which  satisfies  C^y  =  e^.  This  implies 
that  2  is  zero. 

<=)  2  being  the  zero  vector  implies  the  existence  of  a  vector  y  which  satisfies 
C^y  =  Ck-  The  vector  y,  therefore,  abo  satisfies  Cjy  =  0.  This  implies  that  Ck  is 
rank  deficient.  | 

Theorem  2.  Suppose  that  Z^HZ  is  positive  deGnite,  the  search  direction  bits  the 
k-tb  bound  with  an  a  less  than  one,  and  M  (3.2)  is  singular.  If  the  SR  calculates 
that  either  k  is  zero  or  (3  is  negative  for  each  fixed  variable,  then  there  is  no  feasible 
point. 

Proof:  Farkas’  Theorem  of  the  Alternative  states  that  exactly  one  of  the  two  fol¬ 
lowing  alternatives  is  true: 

(  i)  there  exists  a  vector  i  satisfying  Ax  =  b  and  i  >  0; 

(»i)  there  exists  a  vector  ir  satisfying  w'^A  <  0  and  >  0. 

The  proof  will  demonstrate  that  the  SR  has  constructed  a  x  satisfying  (ii). 
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Lemma  2  implies  that  z  is  the  sero  vector.  This  implies  that  y  solves  (3.24). 
Premultiplying  (3.24)  by  gives 

P^C^y  =  P^ek-  (3.26) 

By  using  Cp  =  — r  and  f  =  (1  —  ot)r  (equations  (3.1)  and  (3.7o),  respectively), 
equation  (3.26)  can  be  simplified  to 

y^f=(a-l)pfc,  (3.27) 

where  pk  is  the  fc-th  component  of  p.  The  A:-th  bound  was  hit  by  p,  so  pk  is  negative. 
Thus,  equation  (3.27)  implies  that  y^f  is  positive. 

Since  z  is  the  zero  vector,  the  definitions  of  k  and  P  given  in  (3.10)  and  (3.14), 
respectively,  simplify  io  k  =  a^yt  and 

^  =  -V-  (3-28) 

y^a 

If  A:  is  zero,  P  is  undefined.  By  hypothesis,  P  is  negative  or  undefined  for  all 
fixed  variables.  This  fact,  together  with  the  definition  of  P  given  in  (3.28),  and  y^f 
being  positive,  implies  that  y^o  is  nonnegative  for  all  possible  vectors  o. 

Recall  that  can  be  partitioned  into  free  and  fixed  components  by 

/(  =  (/frn  /(rx)  =  (C  Arx)  -  (3.29) 

Let  IT  =  -y.  Using  (3.29)  and  (3.24)  gives 

w^A  =  -y^A  =  -f(C  X„)  =  -(.J  yi-X,,).  (3.30) 

It  has  been  shown  that  y^a  is  nonnegative  for  every  column  of  Arx-  Together  with 

(3.30),  this  implies  that 


A 
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Let  £rR  =  +  otp-  Using  the  definition  of  r,  and  equations  (3.24)  and  (3.29)  gives 

ir^6  =  -  f) 

=  —  y^Arx^rx  +  y^^ 

—  ~  y^ Arx^rx  +  (3.32) 

The  ifc-th  component  of  Xrx  equals  sero,  frx  equals  rero,  and  y^f  is  positive,  so 
(3.32)  implies  that 

ir'^b  >  0.  (3.33) 

Equations  (3.31)  and  (3.33)  satisfy  alternative  (u),  so  there  is  no  feasible  point. 

I 

8.3.3.  Finite  convergence  to  an  optimal  point.  Since  the  working  set  con¬ 
tains  all  the  general  constraints,  a  feasible  point  has  been  found  when  r  equals  zero. 
During  every  iteration  of  the  method,  r  is  reduced  to  (1  —  or)r.  If  a  is  positive  for 
every  iteration,  r  converges  to  zero  and  a  feasible  point  will  have  been  found. 

Before  proving  that  the  algorithm  will  find  a  feasible  point,  if  one  exists,  two 
assumptions  must  be  made. 

1.  The  initial  point  has  at  least  m  positive  components,  and  the  initial  working 
set  includes  all  zero-valued  variables. 

2.  Movement  along  the  search  direction  hits  at  most  one  bound  at  the  step  of  a. 

The  first  assumption  can  easily  be  met  by  making  at  least  m  components  of  the 
initial  point  positive,  and  including  all  zero-valued  variables  in  the  initial  working 
set.  The  second  assumption  (the  uondegeneracy  assumption),  includes  the  more 
standard  nondegeneracy  assumption  that  movement  along  the  search  direction  will 
not  hit  a  point  that  exactly  satisfies  linearly  dependent  constraints.  However,  the 
nondegeneracy  assumption  used  here  is  only  slightly  more  restrictive  than  the  stan¬ 
dard  one.  The  standard  assumption  can  be  equivalently  stated  as:  If  the  search 
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direction  has  d  degrees  of  freedom,  movement  along  it  can  hit  at  most  d  constraints 
simultaneously  (such  that  the  new  iterate  does  not  satisfy  linearly  dependent  con¬ 
straints).  The  assumption  used  here  can  be  equivalently  stated  as:  If  the  search 
direction  has  d  degrees  of  freedom,  movement  along  it  can  hit  at  most  one  bound 
at  the  step  of  a.  Thus,  when  d  is  one,  the  assumptions  are  equivalent.  (Since 
the  initial  point  is  arbitrary,  the  second  assumption  could  be  enforced  simply  by 
perturbing  the  variables  corresponding  to  all  but  one  of  the  bounds  hit.) 

A  pseudo- Fortran  version  of  how  the  single-phase  method  finds  a  feasible  point  is 
given  in  Table  1.  This  version,  which  is  used  in  the  following  proof,  permits  variables 
to  be  deleted  from  the  working  set  only  by  a  constraint  exchange  performed  as  the 
result  of  using  the  SR. 

Theorem  3.  The  algorithm  will  find  a  feasible  point,  or  indicate  that  no  feasible 
point  exists,  in  a  finite  number  of  steps. 

Proof:  By  the  first  assumption,  the  initial  a  must  be  nonzero,  since  all  zero-valued 
variables  are  in  the  working  set.  If  a  is  one,  the  next  iterate  is  feasible.  If  a  is  less 
than  one,  by  the  second  assumption,  only  one  additional  variable  has  a  zero  value 
at  the  next  iterate.  After  each  pass  through  the  inner  repeat  loop,  one  of  three 
things  must  happen: 

1.  The  loop  is  terminated  because  the  SR  finds  a  suitable  constraint  to  delete.' In 
this  case,  a  constraint  exchange  occurs.  Since  the  SR  was  used,  the  next  search 
direction  calculated  is  a  feasible  direction  with  respect  to  the  freed  variable. 
All  zero-valued  variables  (except  the  one  freed)  are  in  the  new  working  set  so 
the  next  a  is  positive.  This  implies  that  the  value  of  the  freed  variable  will  be 
positive  at  the  next  iterate. 
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2.  The  loop  is  terminated  because  the  SR  did  not  find  a  suitable  variable  to  delete 
and  the  projected  Hessian  is  positive  definite.  In  this  case,  Theorem  2  implies 
that  no  feasible  point  exists. 

3.  The  loop  is  repeated  because  the  SR  did  not  find  a  suitable  variable  to  delete 
and  the  projected  Hessian  is  not  positive  definite.  The  bound  hit  is  not  added  to 
the  working  set.  In  this  case  an  artificial  bound  is  added  (such  that  C  remains 
linearly  independent). 

Note  that  when  the  SR  calculates  P  for  these  artificial  bounds  in  later  iterations, 
P  can  be  positive  or  negative,  since  any  direction  is  feasible  with  respect  to  an 
artificial  bound.  After  a  finite  number  of  artificial  bounds  are  added,  the  projected 
Hessian  must  become  positive  definite.  Thus,  after  a  finite  number  of  passes  through 
the  inner  repeat  loop,  case  1  or  case  2  must  occur. 

Because  a  is  always  positive,  the  residuals  wiU  converge  to  zero  and  a  feasible 
point  will  be  found,  if  one  exists.  This  convergence  to  a  feasible  point  will  occur  in 
a  finite  number  of  steps,  if  the  feasible  region  is  bounded.  An  infinite  sequence  in 
a  closed,  bounded  region  has  at  least  one  limit  point.  Let  f  denote  a  limit  point 
of  the  algorithm.  There  exists  a  positive  radius  determining  a  sphere  about  t  such 
that  the  only  constraints  within  the  sphere  pass  through  f .  Denote  this  sphere  by 
S.  After  a  finite  number  of  steps,  an  iterate  will  enter  S.  Once  an  iterate  lies  in 
5,  all  the  bounds  in  the  working  set  are  satisfied  exactly  at  the  current  iterate  and 
at  2.  The  residuals  decrease  monotonically  from  their  value  at  the  current  iterate 
to  zero  at  2.  Movement  to  a  point  outside  S  from  a  point  in  S  would  mcrease  the 
residuals.  This  implies  that  once  the  iterates  enter  S,  they  will  remain  in  5.  The 
bounds  hit  inside  5  are  bounds  that  are  satisfied  exactly  at  2.  Therefore,  after  a 
finite  number  of  bounds  are  hit  inside  5,  the  iterates  will  converge  to  2.  Thus,  the 
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Imble  1 

Pgendo- Fortran  veraion  of  finding  a  feagible  point 

repeat 

compute  p,  a 
if  (  bound  is  hit  )  then 
repeat 

if  (  new  K  is  singular  )  then 

i£  (  SR  finds  constraint  to  delete  )  then 
exchange  constraints 

else 

if  (  projected  Hessian  is  positive  definite  )  then 
error  =  tme 

else 

add  artificial  bound 

end  if 

end  if 

else 

add  constraint 

end  if 

nntil  f  new  K  is  nonsingular  or  error  ) 

end  if 

fpK  =  *FR  + 

feas  =  (  iterate  satisfies  all  constraints  ) 
nntil  (  feas  or  error  ) 


algorithm  will  find  a  feasible  point,  if  one  exists,  in  a  finite  number  of  steps.  | 
Once  a  feasible  point  has  been  found,  the  single-phase  method  becomes  an 
active-set  feasible-point  method  which  will  determine  an  optimal  point.  The  proof 
of  this  depends  upon  the  strategy  used  to  control  changes  to  the  working  set.  For 
example,  when  the  implementation  of  the  single-phase  method  described  in  the  next 
chapter  finds  a  feasible  point,  the  implementation  and  QPSOL  become  essentially 
the  same  algorithm.  For  a  convergence  proof,  see  Section  4.8. 

S.8.4.  Solving  the  augmented  system.  In  order  to  be  applicable  to  large- 
scale  problems,  the  single-phase  method  uses  a  Lagrangian  method  to  solve  the 
augmented  system.  One  of  the  fastest  methods  for  solving  a  single  system  of  the 
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form  (3.1)  is  by  use  of  sparse  matrix  factorization  techniques  (see  e.g.,  Duff  and  Reid, 
1984;  George  and  Liu,  1980).  However,  within  the  context  of  a  single  QP  iteration, 
where  just  a  single  column  of  the  matrix  C  may  change,  the  factorization  of  a  sparse 
system  from  scratch  would  clearly  be  inefiScient.  We  shall  now  describe  how  a  single 
sparse  factorization  of  a  system  of  the  form  (3.1),  when  used  in  conjunction  with 
the  updated  factorization  of  a  small  dense  matrix,  may  be  used  to  compute  p  and 
fi  for  many  (possibly  all)  QP  iterations. 

Equation  (3.1)  will  be  solved  using  the  Schur  complement  update  (see,  e.g., 
Bisschop  and  Meeraus,  1977,  1980,  Gill  et  af.,  1984a).  Given  factorizations  of  the 
matrices  B  and  S  =  N  —  (the  Schur  complement),  the  solution  of  the 

partitioned  system 


((f=  N){l)-{d) 

(3.34) 

can  be  determined  by  solving,  in  turn,  the  equations: 

Bxv  =  6; 

(3.35o) 

Sij  =  d  —  U^w; 

(3.356) 

Bv  =  b-  Ut}. 

(3.35c) 

We  now  consider  how  to  dehne  the  quantities  in  (3.34)  corresponding  to  (3.1). 
The  symmetric  matrix  B  is  the  partitioned  (n,R  +  m)*dimensional  matrix,  K.  The 
vector  p  is  the  last  m  components  of  v,  and  h  is  dehned  by 

The  vector  p  is  composed  of  selected  elements  of  ri  and  v.  Each  change  in  the 
working  set  leads  to  a  new  column  in  U  (row  in  U^).  The  definition  of  the  column 
of  U  and  the  corresponding  element  of  d  depends  on  whether  a  bound  is  added  or 
deleted. 
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We  illustrate  the  dennition  of  U,  N  and  d  using  the  first  iteration  as  an  example. 
Suppose  that  a  bound  indexed  by  j,  with  the  vector  a  denoting  the  corresponding 
column  of  A,  is  deleted  from  the  working  set  so  that 

C  =  iC  a). 

Let  the  vector  h  denote  the  j-th  row  and  ;-th  column  of  U,  excluding  the  j-th 
element,  and  the  constant  hjj  denote  the  ;-th  diagonal  element  of  )l.  Let  gj  denote 
the  j-th  component  of  the  gradient  vector. 

The  vector  vj  and  iji  then  satisfy 


H  h 


C  0  a 


a’'  h. 


(3.36) 


and  may  be  found  from  (3.35)  with 


N  =  hjj  and 


'■0 


Suppose,  on  the  other  hand,  that  the  first  free  variable  in  C  is  added  to  the 
working  set.  The  system  to  be  solved  is  then 


H  «i 


^0 


(3.37) 


ef  0  0 


where  Cj  is  the  first  coordinate  vector.  Note  that  the  effect  of  the  vector  «i  and  the 
zero  component  in  the  right-hand  side  of  (3.37)  is  to  force  the  first  component  of 
vi  to  be  zero.  The  vector  v\  and  171  are  computed  from  (3.35)  with 


Section  S.S. 


Single-phase  Lagrangian  method  39 


With  this  approach,  the  procedure  for  computing  p  and  n  after  each  change  in 
the  working  set  is  the  same  whether  a  bound  is  added  or  deleted.  In  either  case,  the 
change  consists  of  appending  another  row  to  U,  row  and  column  to  jV,  and  element 
to  d. 

Theorem  1  modihes  hjj  by  adding  e  when  the  matrix  in  (3.18)  is  singular.  We 
shall  now  show  how  the  search  direction  can  be  calculated  independent  of  e.  In 
order  to  conform  to  the  nomenclature  of  this  section,  equation  (3.36)  will  be  used 
instead  of  (3.18). 

When  the  matrix  is  modihed  by  adding  e  to  hjj,  S  equals  c.  Thus,  equation 
(3.356)  gives 


EJquation  (3.35c)  gives 


^-7  =  -»-(;) 

(0 -(")”■ 


=(:) 


Since  c  can  be  chosen  arbitrarily  small,  the  limit  of  vl\\v\\  as  c  — »  0  can  be  used  as 
the  solution  of  (3.36).  This  limit  is  given  by 


where 


(/.)• 

'=(*) 


The  essential  operations  in  (3.35)  are  two  solves  with  B  and  a  solve  with  S.  If 
the  number  of  iterations  is  small  enough  (say,  less  than  100),  S  may  be  treated  as  a 
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dense  matrix.  It  is  then  straightforward  to  use  an  orthogonal  factorization  QS  =  R 
(QJ-Q  =  I,  R  upper  triangular)  or  an  analogous  factorization  LS  =  U  based  on 
Gaussian  elimination  {L  square,  U  upper  triangpilar).  These  factorizations  can  be 
maintained  in  a  stable  manner  as  5  is  updated  to  reflect  changes  to  C.  (The  updates 
involve  adding  rows  and  columns  of  5;  see  Gill  and  Murray,  1974). 

An  implementation  of  this  method  requires  the  availability  of  a  symmetric  in¬ 
definite  sparse  equation  solver  (e.g.,  the  subroutine  MA27,  Duff  and  Reid,  1984) 
and  a  procedure  for  updating  the  LU  or  LQ  factors  of  the  Schur  complement  as 
rows  and  columns  are  added.  We  have  described  how  to  add  bounds  to  the  working 
set  that  are  not  initially  in  the  working  set.  If  we  wish  to  add  a  bound  that  was  ini¬ 
tially  in  the  working  set,  i.e.,  add  a  row  to  U,  this  addition  may  be  achieved  instead 
by  deleting  the  row  from  U  that  was  added  in  order  to  delete  the  bound  from  the 
initial  working  set.  This  will  have  the  beneficial  effect  of  reducing  the  dimension  of 
the  Schur  complement.  Similarly,  if  we  wish  to  delete  a  bound  that  was  not  initially 
in  the  working  set,  i.e.,  add  a  row  to  U,  this  deletion  may  be  achieved  instead  by 
deleting  the  row  from  U  that  was  added  in  order  to  add  the  bound  to  the  initial 
working  set.  (It  may  be  worthwhile  to  favor  such  changes  to  the  working  set.  since 
they  decrease  the  dimension  of  5.) 

As  iterations  proceed,  the  dimension  of  the  Schur  complement  will  grow  and 
the  work  required  in  (3.356)  increases  with  every  change  in  the  working  set.  As  a 
consequence,  it  will  be  periodically  necessary  to  abandon  the  current  Schur  com¬ 
plement  and  factorization  of  B.  At  this  point  the  matrix  B  may  be  redefined  with 
constraints  from  the  latest  working  set  and  refactorized.  (The  situation  is  similar 
to  that  in  linear  programming,  where  product  forms  of  the  basis  factors  must  be 
periodically  condensed.)  The  number  of  updates  that  need  be  performed  before  a 
refactorization  becomes  worthwhile  will  vary  with  each  problem.  While  the  optimal 
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frequency  number  may  vary  considerably  from  problem  to  problem  we  anticipate 
that  typically  100  updates  will  be  possible  before  refactorization. 

The  normal  procedure  for  a  QP  algorithm  is  to  solve  for  the  search  direction  p 
directly.  We  shall  show  that  the  Schur  complement  is  more  eflBciently  implemented 
by  solving  for  q,  where  Iq  +9  is  the  minimum  on  the  working  set  defined  by  C.  The 
improved  efficiency  is  derived  from  the  observation  that  the  right-hand  side  vector 
6  will  be  constant  for  all  iteration^.  Consequently,  the  vector  w  need  be  computed 
only  once,  and  only  the  last  component  of  U'^w  changes  at  each  iteration.  This 
reduces  the  work  required  to  solve  (3.34)  by  approximately  half. 

Instead  of  solving  (3.1),  the  system 

{I  ?)(-:)-(«). 

is  solved  instead,  where  xq  is  the  initial  point.  The  vector  p  can  be  determined  from 
the  relationship 

p  =  Xq  -  T  ■¥  q. 

When  a  variable  is  deleted  from  the  working  set,  6  (3.34)  stays  constant  and  a 
single  component  is  appended  to  d  (see  (3.36)).  However,  if  (3.37)  is  used  when  a 
variable  is  added  to  the  working  set,  6  does  not  remain  constant.  In  order  to  allow 
b  to  remain  constant  in  this  case,  the  right-hand  side  of  (3.37)  can  be  replaced  by 


where  ri  is  the  residual  (at  zq)  of  the  variable  added  to  the  working  set.  The 
effect  of  the  vector  Ci  and  the  component  r;  is  to  force  Xq  -i-q  to  satisfy  exactly  the 
bound  added  to  the  working  set.  Thus,  the  dimension  of  d  increases,  but  6  remains 
constant  when  we  solve  for  q  instead  of  directly  for  p. 
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The  detection  of  singularity  in  Af  (given  by  (3.2)  or  (3.15)),  as  described  in 
Section  3.2,  required  a  significant  amount  of  computation.  However,  by  using  the 
Schur  complement  formed  in  the  above  calculations,  the  amount  of  computation 
can  be  significantly  reduced.  This  reduction  is  based  on  the  observation  that  the 
new  5  is  nonsingular  if  and  only  if  M  is  nonsingular.  Thus,  singularity  in  M  can 
be  detected  while  forming  the  factorization  of  the  new  5. 

In  a  null-space  projection  method  the  projected  Hessian  serves  the  dual  purpose 
of  solving  for  the  search  direction  and  verifying  the  necessary  optimality  condition 
that  the  projected  Hessian  be  positive  semi-definite.  In  a  Lagrangian  method,  the 
search  direction  is  determined  without  explicit  use  of  the  projected  Hessian.  How¬ 
ever,  when  the  method  terminates,  the  definiteness  of  the  projected  Hessian  must 
be  determined  in  order  to  verify  the  optimality  condition.  The  projected  Hessian  is 
not  explicitly  available  in  the  Lagrangian  method,  but  its  definiteness  can  easily  be 
determined  from  the  factorization  of  the  current  K,  i.e.,  the  matrix  in  (3.34). 

The  solution  of  (3.34)  involves  the  factorization  of  B  into  LDL"^,  where  L  is 
lower  triangular  and  D  is  block  diagonal  with  blocks  no  larger  than  2x2  (see  Bunch 
and  Kaufman,  1977,  and  Bunch  and  Parlett,  1971).  If  a  matrix  F  is  symmetric  and 
a  matrix  J  is  nonsingular,  then  F  and  JFJ^  have  the  same  inertia.  Therefore, 
B  and  D  have  the  same  inertia.  The  special  form  of  D  makes  its  inertia  (and 
hence,  the  inertia  of  B)  easy  to  find.  The  inertia  of  the  current  K  is  determined 
by  combining  the  inertias  of  5  and  B.  Knowing  the  inertia  of  K,  the  inertia  of 
the  projected  Hessian  is  readily  available  using  the  following  relationship  (Gould, 
1984): 

In(K’)  =  In(Z^jyZ)-f(f,t,0), 
where  t  is  the  number  of  rows  of  C. 

The  inertia  of  S  is  not  usually  determined.  However,  when  the  method  ter- 
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minates,  we  can  calculate  the  inertia  of  S  in  order  to  determine  if  the  optimality 
condition  is  satisfied.  If  it  is  satisfied,  the  current  iterate  is  a  local  minimum.  If  it 
is  not  satisfied,  the  current  iterate  is  a  constrained  stationary  point  which  is  not  a 
local  minimum.  From  this  constrained  stationary  point  there  is  certainly  a  direc¬ 
tion  of  negative  curvature.  However,  no  efficient  means  is  available  to  calculate  a 
direction  of  negative  curvature  in  this  case. 


A  Method  for  Dense  Qnadratic  Programs 


The  single-phase  method  in  the  previous  chapter  uses  a  Lagrangian  method  to 
solve  the  augmented  system.  The  single-phase  method  for  dense  QP’s  described 
in  this  chapter,  QPSFA,  uses  a  projection  method.  (A  pseudo- Fortran  version  of 
QPSFA  is  given  in  Table  2.)  In  order  to  test  most  aspects  of  a  single-phase  method 
it  does  not  particularly  matter  what  method  is  used  to  solve  the  augmented  system, 
assuming  that  the  problems  are  dense.  A  single-phase  method  for  dense  QP’s,  rather 
than  for  large-scale  QP’s,  has  been  implemented  for  several  practical  reasons.  First, 
more  test  problems  can  be  run  since  it  is  computationally  less  expensive  to  test  small 
dense  problems,  than  large-scale  problems.  Second,  much  of  the  code  necessary 
in  QPSFA  was  readily  available  from  QPSOL.  Third,  it  made  a  comparison  with 
QPSOL  possible.  Finally,  we  know  of  no  generally  available  code  for  large-scale  QP. 

For  purposes  of  exposition,  the  quadratic  programming  problem  solved  by 
QPSFA  is  assumed  to  be  stated  in  the  following  form: 

QP2  minimize  c^x  -t-  hx'^Hx 

xeS"  ^ 

subject  to  Ax  >  b. 

The  vector  r  =  Ax  —  b  will  denote  the  residuals. 

All  constraints  in  QP2  are  general  constraints — there  are  no  bound  constraints. 
This  simplification  limits  the  description  of  updates  to  adding  and  deleting  general 
constraints.  (However,  the  implementation  of  QPSFA  treats  bounds  and  general 
constraints  separately,  since  separate  treatment  is  more  efficient;  see  Gill  et  al., 


The  results  of  Chapter  Three  are  stated  for  a  problem  of  the  form  QPl.  These 
results  remain  valid  in  this  chapter  since  a  problem  of  the  form  QP2  can  be  trans¬ 
formed  into  a  problem  of  the  form  QPl  by  adding  slack  variables.  In  a  large-scale 
problem,  there  is  no  particular  objection  to  the  use  of  slack  variables,  since  only 
the  nonzero  elements  of  the  constraint  matrix  are  stored.  Since  all  of  the  elements 
of  the  constraint  matrix  are  stored  in  a  dense  problem,  it  is  preferable  not  to  add 
slack  variables.  QPSFA  does  not  explicitly  use  slack  variables.  Instead,  QPSFA 
implicitly  sets  a  slack  variable  equal  to  the  corresponding  residual,  if  the  general 
constraint  is  satisfied,  and  equal  to  zero,  if  the  general  constraint  is  violated.  This 
avoids  storing  and  updating  the  slack  variables,  but  implies  that  movement  along 
the  search  direction  may  hit  more  than  one  zero-valued  slack  variable  bound  at  a 
step  of  zero.  Thus,  the  nondegeneracy  assumption  of  Theorem  2  may  not  be  sat¬ 
isfied.  However,  very  rarely  does  this  present  a  problem  in  practice  (see  Section 
4.9). 

The  first  section  contains  an  outline  of  QPSFA.  The  next  two  sections  describe 
the  required  factorizations  and  how  they  are  updated.  Calculation  of  the  search 
direction  from  a  non-optimal  constrained  stationary  point  is  presented  in  Section 
4.4.  Section  4.5  contains  a  description  of  how  QPSFA  handles  an  indefinite  Hessian. 
[QPSFA  maintains  a  projected  Hessian  that  has  at  most  one  nonpositive  eigenvalue. 
In  some  instances  in  this  thesis  it  may  have  been  more  complete  to  use  the  phrase 
“Hessian  of  unspecified  definiteness”  instead  of  “indefinite  Hessian”.  However,  this 
longer  phrase  has  not  been  used  in  order  to  avoid  wordiness.)  Constraint  deletion 
strategies  are  then  described  in  Section  4.6.  Resolution  of  singularity  in  the  aug¬ 
mented  system  is  discussed  in  Section  4.7.  Infeasibility  detection  and  convergence 
to  an  optimal  point  are  discussed  in  Section  4.8.  The  last  section  compares  QPSFA 
to  the  QP  methods  described  in  Section  2.6. 
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Outline  of  QPSFA  ^6 


4.1.  Outline  of  QPSFA 
The  main  steps  of  QPSFA  are: 

1.  Initialize.  Determine  which  violated  or  exactly  satisfied  constraints  are  in 
the  initial  working  set.  Let  the  subscripts  “w*  and  “i”  denote  a  vector  or 
matrix  whose  elements  are  associated  with  constraints  in  the  working  set  and 
constraints  not  in  the  working  set,  respectively.  Thus,  the  constraints  in  the 
working  set  determine  the  f  x  n  matrix  Aw  (t  <  n),  and  the  vectors  bw  and  r„.. 

2.  Test  for  convergence. 

3.  Find  p  and  /x.  Using  a  projection  method,  solve 

4.  Either  delete  a  constraint  or  calculate  the  step  length. 

a.  If  a  constraint  is  deleted,  update  K  and  return  to  step  2. 

b.  If  the  step  length  is  calculated,  continue  with  step  5.  (If  ajp  <  0  for 
any  violated  constraint,  a  is  zero.  Otherwise,  a  is  the  step  to  the  nearest 
satisfied  constraint  (not  in  the  working  set)  along  the  search  direction.) 

5.  Update,  x  x  +  ap,  (1  —  o)r,v,  and  r,  *—  r,  +  aA,p. 

If  a  <  1,  add  a  constraint  and  update  K. 

Return  to  step  2. 

Note  that  a  is  zero  whenever  aJp  is  negative  for  any  violated  constraint.  This 
implies  that  the  residual  of  every  violated  constraint  is  nondecreasing. 

4.2.  Factorisation  of  the  working  set  and  the  projected  Hessian 

QPSFA  is  a  projection  method  that  uses  equation  (2.9)  to  solve  the  augmented 
system  for  the  search  direction  and  the  multipliers.  In  order  to  use  these  equations, 
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representations  of  T,  Z,  Y,  and  R  are  needed.  In  this  section  we  describe  these 
matrices  in  more  detail. 

Let  nz  denote  n  -  t,  the  number  or  columns  of  Z.  The  columns  of  the  n  x 
matrix  Z  form  a  basis  for  the  subspace  of  Aw  (i-e.,  A^Z  =  0).  If  f  is  sero,  Z  may 
be  taken  as  the  n-dimensional  identity  matrix. 

The  matrix  Z  may  be  defined  from  the  TQ  factorization  of  Aw  The  TQ 
factorization  of  Aw  is  defined  by 

AwQ=iO  T),  (4.1) 

where  Q  is  an  n  x  n  matrix,  and  T  is  an  <  x  <  “reverse"  triangular  matrix  such  that 

=  0  for  i  +  j  <  t.  When  Q  is  orthonormal,  the  TQ  factorization  is  simply  a 
permuted  version  of  the  standard  LQ  factorization.  The  reasons  for  favoring  this 
form  will  be  discussed  in  Section  4.3.2  when  procedures  for  updating  the  matrbc 
factorizations  following  a  constraint  deletion,  are  considered. 

From  (4,1)  it  follows  that  the  first  columns  of  Q  can  be  taken  as  the  columns 
of  the  matrix  Z.  Denote  the  remaining  columns  of  Q  by  F  (the  columns  of  Y  form 
a  basis  for  the  subspace  of  vectors  spanned  by  the  rows  of  Aw)- 

The  Cholesky  factorization  of  the  projected  Hessian  is  given  by 

Z'^HZ  =  BTR.  (4.2) 

QPSFA  is  based  on  the  TQ  and  Cholesky  factorizations.  As  mentioned  at  the 
beginning  of  this  section,  the  matrices  to  be  stored  include,  from  (2.9),  the  n  x  n 
matrbc  Q,  the  f-dimensional  reverse  triangular  matrbc  T,  and  the  n^-dimensional 
upper  triangular  matrix  R.  The  matrix  Q  is  conceptually  partitioned  into  two 
submatrices  —  Z,  the  first  n*  columns,  and  Y,  the  last  t  columns,  i.e. 

Q  =  (Z  Y). 
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Changes  in  the  working  set  will  cause  changes  in  these  four  matrices.  From  (4.1) 
we  see  that  any  transformations  applied  to  the  columns  of  Y  will  also  be  applied 
to  the  columns  of  T\  from  (4.2)  it  follows  that  any  transformations  applied  to  the 
columns  of  Z  will  also  be  appUed  to  the  columns  of  R. 

4.3.  Updating  the  factorisations 

Unless  the  correct  active  set  is  known  a  priori,  the  working  set  must  be  modified 
during  the  execution  of  an  active-set  method,  by  adding  and  deleting  constraints. 
Because  of  the  simple  nature  of  these  changes,  it  is  possible  to  update  the  necessary 
matrix  factorizations  to  correspond  with  the  new  working  set.  In  the  remainder  of 
this  section,  we  consider  how  to  update  the  TQ  factorization  (4.1)  and  the  Cholcsky 
factorization  (4.2)  following  a  single  change  in  the  working  set.  If  several  constraints 
are  to  be  added  or  deleted,  the  procedures  are  repeated  as  necessary. 

We  assume  throughout  this  section  that  Q  is  orthonormal.  The  discussion  of 
updates  will  assume  a  general  familiarity  with  the  properties  of  plane  rotations. 
Sequences  of  plane  rotations  are  used  to  introduce  zeros  into  appropriate  positions 
of  a  vector  or  matrix,  and  have  exceptional  properties  of  numerical  stability  (see, 
e.g.,  Wilkinson,  1965,  pp.  47-48). 

We  shall  illustrate  each  modification  process  using  sequences  of  simple  diagrams, 
following  the  conventions  of  Cox  (1981)  to  show  the  effects  of  the  plane  rotations. 
Each  diagram  depicts  the  changes  resulting  from  one  plane  rotation.  The  following 
symbols  are  used: 

K  denotes  a  non-zero  element  that  is  not  altered; 
m  denotes  a  non-zero  element  that  is  modified; 
f  denotes  a  previously  zero  element  that  is  filled  in; 

0  denotes  a  previously  non-zero  element  that  is  annihilated;  and 
(or  blank)  denotes  a  zero  element  that  is  unaltered. 
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In  the  algebraic  representation  of  the  updates,  barred  quantities  will  represent  the 
“new*  values. 


4.3.1.  Adding  a  general  constraint.  When  a  general  constraint  is  added  to 
the  working  set,  its  index  can  simply  be  placed  at  the  end  of  the  list  of  indices 
of  general  constraints  in  the  working  set.  Therefore,  without  loss  of  generality  we 
shall  assume  that  the  new  constraint  is  added  as  the  last  row  of  The  row 
dimension  of  A^-  and  the  dimension  of  T  will  thus  increase  by  one,  and  the  column 
dimension  of  Z  will  decrease  by  one.  Let  denote  the  new  row  of  An, .  Let 
denote  the  vector  a^Q,  and  partition  tv^  as  (u;J  tUy),  so  that  consists  of  the 
first  1*2  components  of  w^.  From  (4.1),  it  follows  that 


AnQ  = 


We  see  that  a  new  matrix  Q  can  be  obtained  by  applying  a  sequence  of  plane  rota¬ 
tions  on  the  right  of  Q  to  transform  the  vector  wj  to  suitable  form;  the  transformed 
matrix  Q  then  becomes  Q.  The  sequence  of  rotations  take  linear  combinations  of 
the  elements  of  tyj  to  reduce  it  to  a  multiple  (say,  7)  of  e^,  where  e^  denotes  the 
n^-th  coordinate  vector.  The  rotations  are  constructed  to  alter  pairs  of  components 
in  the  order  (1,2),  (2,3),  ...,  (n^  —  l|Wz)i  as  indicated  in  the  following  diagrams, 
which  depict  the  vector  wj  as  it  is  reduced  to  'je^: 


K  M  H  X 


)  -  (- 
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■)  -  ( 


The  effect  of  these  transformations  can  be  expressed  algebraically  as 


fP  0\  _  fo  0 

Q  I  I  =  An,  0=1 
Vo  //  Vo  7 
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By  construction,  the  rotations  in  P  affect  only  the  first  columns  of  Q,  so  that 
the  last  t  columns  of  Q  are  identical  to  those  of  Q,  and  the  first  columns  of  Q 
are  linear  combinations  of  the  first  columns  of  Q.  Hence, 

ZF=(2  y), 


where  y,  the  transformed  last  column  of  Z,  becomes  the  first  column  of  F. 

The  plane  rotations  applied  to  Z  also  transform  the  Cholesky  factor  R  of  the 
projected  Hessian.  The  chosen  order  of  the  rotations  in  P  means  that  each  suc¬ 
cessive  rotation  has  the  effect  of  introducing  a  subdiagonal  element  into  the  upper- 
triangular  matrix  R,  as  shown  in  the  following  sequence  of  diagrams.  For  clarity, 
we  again  show  the  vector  toj  at  the  top  as  it  is  reduced  to  (0  7)^;  the  matrices 
underneath  represent  the  transformed  version  of  R. 


Since  the  last  column  of  the  matrix  ZF  is  not  part  of  Z,  the  last  column  of  RP 
can  be  discarded.  The  remaining  matrix  is  then  restored  to  upper-triangular  form 
by  a  forward  sweep  of  row  rotations  (say,  the  matrix  F),  which  is  applied  on  the 
Je/t  to  eliminate  the  subdiagonal  elements,  as  shown  in  the  following  diagrams. 
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Let  R  denote  the  matrix  RP  with  its  last  column  deleted;  then  we  have 


I 


K- 


where  R  is  upper-triangular.  Note  that  the  rotations  in  P  affect  R,  but  not  Q  or 

f. 

The  number  of  multiplications  associated  with  adding  a  general  constraint  in¬ 
cludes  the  following  (where  only  the  highest-order  term  is  given);  n’  to  form  A^.Q\ 
3n1  for  the  two  sweeps  of  rotations  applied  to  R\  and  3nn^  to  transform  the  appro¬ 
priate  columns  of  Q.  (We  assume  the  three-multiplication  form  of  a  plane  rotation; 
see  Gill  and  Murray,  1974.) 

4.3.2.  Deleting  a  general  constraint.  When  a  general  constraint  (say,  the 
i-th)  is  deleted  from  the  working  set,  the  row  dimension  of  Aw  and  the  dimension 
of  T  are  decreased  by  one,  and  the  column  dimension  of  Z  is  increased  by  one. 
From  (4.1),  we  have 

AwQ  =  iO  5), 

where  5  is  an  (<  —  1)  x  f  matrix  such  that  rows  1  through  i  —  1  are  in  reverse 
triangular  form,  and  the  remaining  rows  have  one  extra  element  above  the  reverse 
diagonal.  In  order  to  reduce  the  last  <  —  1  columns  of  5  to  the  desired  reverse 
triangular  form  of  T,  a  backward  sweep  of  plane  rotations  (say,  P)  is  applied  on  the 
right,  to  take  linear  combinations  of  columns  (f  -  i  -f- 1,  t  -  i),  . . .,  (2, 1).  The  effect 
of  these  rotations  is  shown  in  the  following  diagrams  for  the  case  f  =  6  and  t  =  3: 
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This  transformatioD  may  be  expressed  as 

A^Q  °  )  =A^Q  =  [0  T).  (4.3) 

The  first  columns  of  Q  are  not  affected  by  the  rotations  in  P,  and  hence  the 
first  n*  columns  of  2  are  identical  to  the  columns  of  Z.  The  matrix  2  is  given  by 

Z  =  {Z  z\  (4.4) 

where  the  new  (last)  column  2  of  ^  is  a  linear  combination  of  the  relevant  columns 
of  Y .  (When  1  =  t,  no  reduction  at  all  is  needed,  and  2  is  just  the  first  column  of 

y-) 

By  using  the  sweep  of  plane  rotations  described  above,  it  is  now  straightforward 
to  prove  the  following  lemma. 

Lemma  6.  If  the  j-th  general  constraint  is  deleted  from  the  working  set, 

afz  =  crtj, 

where  2  is  the  new  column  of  2  (4.4),  ty  is  the  j^th  diagonal  element  ofT,  and  a  is 
the  product  of  t  —  j  sines. 

Proof: 

By  applying  the  same  plane  rotations  to  ajY,  that  were  used  to  find  T  (4.3), 
the  vector  ajY  will  be  transformed  into  the  vector  Oy  (2  K).  The  element  ay2 

will  be  the  first  component  of  this  transformed  vector. 

A  plane  rotation  is  equal  to  an  identity  matrbc  except  for  rows  and  columns  t 
and  j.  Assuming  i  is  less  than  j,  the  (»,»),  (*,;),  (;, »)  and  (;,;)  elements  have  the 
configuration 
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where  =  1  (hence,  c  =  costf  and  a  =  sin^  for  some  0). 

The  first  t~j  components  of  the  f-dimensional  vector  ajY  are  lero.  The  first  of 
the  remaining  j  components  is  tj.  The  first  plane  rotation  will  form  a  transformed 
vector  that  starts  with  t  —  j  —  I  tero  components,  followed  by  the  product  of  a  sine 
with  tj.  Each  successive  plane  rotation  has  a  similar  effect.  This  implies  that  after 
t  —  j  plane  rotations  have  been  applied  to  ajY,  the  first  component,  ajz,  will  be 
the  product  of  t  -  j  sines  with  tj.  | 

Because  of  the  form  (4.4),  the  new  projected  Hessian  matrix  HZ  is  given  by 


Z'^HZ  =  = 


(R'^R 


V 


(4.5) 


where  t;  =  Z^ Hz  and  0  =  z"^ Hz. 

Let  r  denote  the  solution  of  R^r  =  v;  if  2^H2  is  positive  definite,  the  quantity 
0  -  r^r  must  be  positive.  In  this  case,  only  one  further  step  of  the  row-wise  Cholesky 
factorization  is  needed  to  compute  the  new  Cholesky  factor  R,  which  is  given  by 


where  =  0  ~  r^r.  If  the  matrix  2^  H  2  is  not  positive  definite,  the  Cholesky 
factorization  may  be  undefined  or  ill-conditioned,  and  other  techniques  should  be 
used  to  modify  the  factorization  without  excessive  additional  computation  or  loss 
of  numerical  stability  (see  Section  4.5.) 

The  number  of  multiplications  associated  with  deleting  the  »-th  general  con¬ 
straint  includes  the  following  (where  only  the  highest-order  term  is  given):  |(<  —  *)’ 
to  operate  on  T\  3n(t  —  *)  to  transform  Q;  to  form  Hz\  nnz  to  form  Z"^ Hz-,  and 
jnl  to  compute  the  additional  row  of  the  Cholesky  factor. 
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The  justification  for  using  the  TQ  factorization  arises  from  this  part  of  an 
active-set  method.  From  a  theoretical  viewpoint,  2  would  remain  an  orthogonal 
basis  for  the  null  space  of  regardless  of  the  position  in  which  the  new  column 
appeared.  However,  in  order  to  update  '.he  Cholesky  factors  efficiently,  the  new 
column  must  appear  after  the  columns  of  Z  (otherwise,  (4.5)  would  not  hold).  The 
TQ  factorization  has  an  implementation  advantage  because  the  new  column  of  2 
automatically  appears  in  the  correct  position  after  deletion  of  a  constraint  from  the 
working  set.  With  other  alternatives,  the  housekeeping  associated  with  the  update 
of  R  is  more  complicated.  For  example,  in  an  implementation  based  on  the  LQ 
factorization,  the  new  column  might  be  moved  to  the  end  of  Z,  or  a  list  could  be 
maintained  of  the  locations  of  the  columns  of  Z;  another  alternative  is  to  store  the 
columns  of  Z  in  reverse  order  (see  Gill  and  Murray,  1977). 

4.4.  Search  direction  calcnlation  from  a  stationary  point 

If  the  current  iterate  is  a  non-optimal  constrained  stationary  point  and  the  residuals 
of  the  working  set  are  zero,  the  calculation  of  the  search  direction  can  be  simplified. 
At  a  constrained  stationary  point,  we  know  that 

Z'^g  =  0.  (4.6) 

The  point  is  non-optimal,  so  a  constraint  with  a  negative  multiplier  will  be 
deleted  from  the  working  set.  The  updated  matrix  Z,  denoted  by  Z,  is  given  by 

Z  =  iz  z).  (4.7) 

FVom  (4.6)  we  know  that  g  wiU  be  orthogonal  to  all  columns  of  2  except  the 
last  one.  Hence, 

Z^y  = 


(4.8) 
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Referring  to  equation  (2.9),  note  that  pv  is  sero,  since  r  is  eero.  Using  the 
Cholesky  factorization  of  the  updated  projected  Hessian,  and  equations  (4.8)  and 
(2.96)  gives 

R^Rpz  =  -7^1 .  (4.9) 

The  first  step  in  solving  (4.9)  involves  the  lower-triangular  matrix  i?^.  Because 
of  the  special  form  of  the  right-hand  side  of  (4.9),  the  result  is  a  multiple  of  c^,  and 
hence  the  vector  pz  is  the  solution  of 

Rpz  =  -—ez,  (4.10) 

where  r^  is  the  last  diagonal  element  of  R. 

In  QPSFA,  a  search  direction  must  be  calculated  even  if  the  residuals  are 
nonzero.  Equation  (4.10)  can  be  used  only  if  the  residuals  are  zero.  In  a  feasible- 
point  active-set  method  that  only  deletes  constraints  if  the  projected  gradient  is 
zero,  it  can  be  shown  that  pz  is  always  the  solution  of  an  upper- triangular  system 
of  the  form  (4.10)  (see  Gill  et  al.,  1981).  This  simplification  in  computing  the  search 
direction  will  frequently  apply  to  the  single-phase  method.  Also,  it  will  be  used  to 
define  a  search  direction  when  the  projected  Hessian  is  not  positive  definite. 
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4.6.  Indefinite  QP 

4.6.1.  Maintaining  a  positive  definite  projected  Hessian.  At  a  strong  local 
minimum,  the  projected  Hessian  is  positive  definite.  If  the  projected  Hessian  is  not 
positive  definite,  a  constrained  stationary  point  is  not  a  local  minimum  and  the 
direction  defined  by  (2.9d]  is  not  necessarily  a  descent  direction.  Thus,  it  seems 
reasonable  that  the  projected  Hessian  of  our  working  set  should  be  as  positive 
definite  as  possible.  QPSFA  maintains  a  projected  Hessian  that  has  at  most  one 
nonpositive  eigenvalue.  With  this  restriction  on  the  projected  Hessian,  it  is  possible 
to  compute  a  feasible  direction  of  descent  for  the  quadratic  objective  function  or 
a  feasible  non-ascending  direction  of  negative  curvature  whenever  the  working  set 
residuals  are  zero.  In  doing  so,  it  is  desirable  to  retain  the  maximum  amount  of 
information  in  the  present  (indefinite)  projected  Hessian,  in  order  to  preserve  the 
eflficiencies  associated  with  quadratic  programming;  therefore,  standard  techniques 
that  after  an  indefinite  matrbc  (see  Gill,  Murray  and  Wright,  1981)  are  not  suitable. 
However,  there  is  a  danger  of  substantial  numerical  instability  if  care  is  not  exercised 
in  updating  a  factorization  of  an  indefinite  matrix. 

QPSFA  will  solve  an  indefinite  QP,  provided  the  initial  projected  Hessian  is 
positive  definite.  Under  these  circumstances,  the  only  way  in  which  the  projected 
Hessian  can  become  indefinite  is  when  a  single  constraint  is  deleted  from  the  working 
set;  the  effect  of  this  change  on  the  Cholesky  factors  of  the  new  projected  Hessian 
is  that  R  is  augmented  by  a  single  column  and  the  new  last  diagonal  element  of  R, 
denoted  by  r^,  is  zero  or  undefined  (being  the  square  root  of  a  negative  number). 
However,  the  definition  (4.10)  of  the  search  direction  from  a  non-optimal  constrained 
stationary  point,  is  such  that  the  last  diagonal  element  of  R  affects  only  the  scaling 
of  Px  and  not  its  direction.  Consequently,  without  any  loss  of  generality,  we  may  use 
the  modulus  of  the  operand  when  computing  the  square  root  for  the  last  diagonal 
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element  of  R.  In  this  special  situation,  the  search  direction  defined  by  (2.9d)  is 
a  direction  of  negative  curvature.  The  last  diagonal  element  of  the  modified  R  is 
denoted  by  and  is  given  by  r|  =  —rf. 

By  altering  R  as  described  above,  we  are  conceptually  making  a  rank-one  change 
to  H.  To  see  this,  assume  that  Z^HZ  is  positive  definite.  If  a  variable  is  released 
from  its  bound  and  the  new  projected  Hessian,  2^32^  is  indefinite  (one  nonpositive 
eigenvalue),  we  could  modify  H  to  form  B,  giving 

B  =  H  +  czz^,  (4.11) 

where  z"^ 2  —  e^  (last  element  of  ej  is  one),  and  <7  is  a  positive  constant.  In  order 
to  make  the  above  alteration  to  R^  define  a,hy  a  =  —2rf.  This  implies  that  the 
modified  R^R,  given  by 

R^R  =  2^B2  =  2^B2  -t-  (4.12) 

is  positive  definite.  We  never  need  to  explicitly  find  B.  Instead,  modify  R  by 
making  its  last  diagonal  element  positive  rather  than  the  square  root  of  a  negative 
number.  If  the  modification  is  not  needed,  define  B  =  H.  This  modification  implies 
that  Z^BZ  will  always  be  positive  deSnite. 

The  following  lemma  will  prove  that  if  the  search  direction  is  calculated  using 
the  modified  R,  it  will  be  a  direction  of  negative  curvature. 

Lemma  7.  Suppose  that  r’  is  negative  when  a  constraint  is  deleted  from  the 
working  set.  If  the  wodiSed  R  given  in  (4.12)  is  used,  the  search  direction  (4.10) 
will  be  a  direction  of  negative  curvature. 

Proof: 

Pre-  and  post-multiplying  (4.12)  by  and  p*,  respectively,  and  rearranging 


terms  gives 


p12'^H2pz  =  plR'^Rpz  -  <r(p^e,)’. 


(4.13) 
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In  order  to  show  that  the  search  direction  is  a  direction  of  negative  curvature, 
the  left  side  of  (4.13)  will  be  shown  to  be  negative. 

The  terms  on  the  right  side  of  (4.13)  will  now  be  expressed  in  equivalent  forms. 
Consider  the  first  term  of  the  right  side  of  (4.13).  Using  (4.10)  gives 

vlR'^Rpz  =  (4.14) 


Now  consider  the  last  term  of  (4.13).  Denote  the  last  element  of  by  p^. 
Using  (4.10),  pt  is  given  by 

=  (4.15) 

Using  the  definition  of  <r,  equation  (4.15),  and  the  relationship  the  last 

term  of  (4.13)  can  be  written  as 


(4.16) 


Using  (4.16)  and  (4.14),  equation  (4.13)  can  be  rewritten  as 

(4.17) 

Thus,  H2pz  is  negative,  as  required.  | 

Suppose  the  current  iterate  is  a  constrained  stationary  point,  the  projected 
Hessian  is  positive  definite,  and  the  smallest  multiplier.  Ay,  has  a  value  of  zero.  If 
the  j-ih  constraint  is  deleted  and  the  projected  Hessian  is  now  indefinite,  how  do  we 
compute  a  direction  of  negative  curvature?  In  this  case,  we  shall  show  that  (4.10) 
does  not  define  a  direction  of  negative  curvature. 
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Lemma  8.  If  ||Z^<7||  =  0,  and  the  constraint  oji  >  6y  with  multiplier  Ay,  is  deleted 
from  the  working  set,  then  ||i?^5||  =  z^a^Xj. 

Proof:  Define  the  new  constraint  matrix,  Aw,  by 


(4)= 


(4.18) 


Since  the  projected  gradient  is  the  zero  vector. 


g  = 


(4.19) 


Express  the  new  projected  gradient  by 


Z^9  = 


■fT„  _  I  ^"^9  \  ^ 


(4.20) 


Using  (4.18)  and  (4.19),  express  z’^g  by 


z^)  =  z^AI.X  =  z^(AI  ay  )i  =  r’'a,Aj, 


(4.21) 


since  2  =  0.  | 

If  the  multiplier  of  the  deleted  constraint  is  tero,  Lemma  6  implies  that  ^  (4.9) 
is  zero  also.  This  implies  that  the  search  direction  is  the  zero  vector.  Thus,  the 
above  strategy  to  find  a  direction  of  negative  curvature  has  failed. 

In  order  to  compute  a  direction  of  negative  curvature  when  the  multiplier  of 
the  deleted  constraint  is  zero  and  the  new  projected  Hessian  is  indefinite,  we  set  7 
(4.9)  equal  to  one.  Thus,  7  has  a  nonzero  value,  so  Lemma  7  proves  that  the  search 
direction  is  a  direction  of  negative  curvature. 


When  we  give  7  a  nonzero  value,  we  are  conceptually  altering  c.  The  new  vector 
c  and  associated  gradient  are  defined  by  ff  =  c  +  oy,  and  g  =  g  +  aj.  In  this  event, 
the  new  projected  gradient  can  be  expressed  by 

2^g  =  2^g  +  2\  =  2%  = 

From  Lemma  6,  z^aj  is  nonzero.  Thus,  7  (4.9)  is  nonzero. 

In  QPSFA,  the  vector  Z^g  is  used  so  c  is  only  conceptually  being  modified.  In 
a  large-scale  implementation,  the  vector  Z'^g  would  not  be  available.  In  this  case, 
we  could  use  g  instead  of  g  in  the  augmented  system. 

In  practice,  when  is  small,  its  sign  is  suspect  due  to  roundoff  error.  Thus, 
it  is  better  to  determine  the  sign  of  p  from  the  requirement  ajp  >  0. 

Once  a  direction  of  negative  curvature  is  computed,  a  constraint  must  be  added 
during  the  next  iteration  (otherwise  the  objective  function  is  unbounded  below).  It 
can  be  shown  that  the  addition  of  a  new  constraint  to  the  working  set  cannot 
increase  the  number  of  negative  eigenvalues  in  the  projected  Hessian.  Suppose  that 
such  a  constraint  “exchange”  takes  place,  and  that  the  last  diagonal  element  of  R 
was  previously  altered  as  described  above  in  order  to  avoid  taking  the  square  root 
of  a  negative  number.  It  can  be  shown  that  the  arbitrary  value  resulting  from  the 
constraint  deletion  does  not  propagate  ^o  any  other  elements,  i.e.  the  factors  of 
the  “exchanged”  projected  Hessian  will  still  have  a  single  arbitrary  last  diagonal 
element.  This  result  implies  that  when  a  constraint  exchange  results  in  a  positive- 
definite  projected  Hessian,  the  last  diagonal  element  of  the  triangular  factor  must 
be  recomputed. 

There  might  appear  to  be  a  danger  of  numerical  instability  in  allowing  an 
indefinite  projected  Hessian.  Certainly  the  occurrence  of  an  undefined  quantity 
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during  the  calculation  of  R  implies  that  the  usual  bound  on  growth  in  magnitude 
in  the  elements  of  R  from  the  positive-definite  case  (see  Gill,  Murray  and  Wright, 
1981)  does  not  apply.  However,  after  a  sequence  of  constraint  exchanges,  it  is 
possible  only  for  the  fast  row  of  R  to  be  •contaminated"  by  growth.  It  is  possible 
to  monitor  the  error  in  the  last  row  and  if  need  be  it  can  be  recomputed  as  soon  as 
a  positive-definite  projected  Hessian  is  obtained. 

This  result  justifies  tolerating  a  very  limited  form  of  indefiniteness,  as  described 
above.  However,  the  overall  viewpoint  of  the  algorithm  is  that  the  projected  matrix 
should  be  kept  “as  positive  definite  as  possible".  Therefore,  once  the  projected 
Hessian  is  indefinite,  no  further  constnintB  are  deleted  until  enough  constraints 
have  been  added  so  that  the  projected  Hessian  has  become  positive  definite. 

4.6.2.  Initial  projected  Hessian.  The  treatment  of  indefiniteness  in  the  pro¬ 
jected  Hessian,  described  in  the  previous  section,  is  based  on  the  assumption  that  the 
initial  projected  Hessian  is  positive  definite.  If  J7  is  indefinite,  the  initial  projected 
Hessian  (based  on  the  initial  working  set)  will  not  aecessirily  be  positive  definite. 
This  section  will  describe  the  strategy  used  to  determine  an  initial  projected  Hes¬ 
sian  that  is  positive  definite.  For  details  on  how  this  strategy  was  developed,  see 
Gill  et  ai.,  1985.  The  strategy  is  based  on  the  observation  that  the  projected  Hes¬ 
sian  matrix  will  be  positive  definite  if  enough  constraints  are  included  in  the  initial 
working  set.  (The  null  matrix  is  positive  definite  by  definition,  corresponding  to 
the  case  when  An-  contains  n  constraints.)  This  suggests  somehow  adding  artiScial 
constraints  to  the  working  set  (that  are  deleted  later)  to  make  the  projected  Hessian 
positive  definite. 


One  strategy  is  to  add  as  many  general  constraiints  to  the  working  set  as  neces¬ 
sary  to  create  a  temporary  vertex,  thus  ensuring  that  the  initial  projected  Hessian 
will  be  positive  definite.  This  strategy  is  particularly  attractive  because  it  requires 
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DO  further  work  to  update  the  TQ  factonzatiOD  or  to  compute  the  projected  Hessian. 

The  computation  proceeds  as  follows.  At  the  beginning  of  the  QP  phase,  the 
working  set  Aw  and  its  TQ  factorisation  (4.1)  are  available.  The  matrix  Z'^HZ 
is  formed,  and  the  Cholesky  procedure  with  symmetric  interchanges  is  initiated. 
Recall  that  the  Cholesky  procedure  without  interchanges  will  break  down  if  the 
matrix  is  not  positive  definite.  However,  by  performing  interchanges  (such  that  the 
column  with  largest  positive  diagonal  element  is  processed  next  at  each  step),  we 
can  identify  the  largest  possible  positive-definite  principal  minor. 

In  algebraic  terms,  assume  that  a  permutation  matrix  P  has  been  chosen  so  that 
the  upper  left  submatrix  of  P^Z^HZP  is  positive  definite.  In  effect,  the  columns 
oi  ZP  are  partitioned  as  ZP  =  {  Zi  Z^  ),  such  that  ZjHZy  is  positive  definite, 
i.e., 

ZjHZ,  =  RlR,. 

A  working  set  for  which  Zi  defines  the  null  space  can  be  obtained  by  including 
the  rows  of  Zlf  as  temporary  general  constraints.  After  P  is  determined  (by  the 
Cholesky  procedure),  the  columns  of  Z  are  reordered  (i.e.,  Z  is  replaced  by  ZP); 
note  that  the  properties  of  Z  as  a  basis  for  the  null  space  of  Aw  are  unaffected 
by  its  column  ordering.  The  minimization  of  the  quadratic  function  then  proceeds 
within  the  subspace  defined  by  Zi . 

We  discuss  here  only  the  case  when  Q  is  orthogonal.  (For  details  about  the 
case  when  Q  is  a  product  of  stabilized  elementary  transformations,  see  Gill  et  al., 
1985.)  The  temporarily  augmented  working  set  is  given  by 


(4.22) 


SO  that  p  will  satisfy  AwP  =  0  and  Z^p  =  0.  By  definition  of  the  TQ  factorization. 


Aw  automatically  satisfies  the  following: 

where 


and  hence  the  TQ  factorization  of  (4.22)  is  free. 

The  implementation  of  this  procedure  involves  several  subtle  points.  The  matrix 
Z2  need  not  be  kept  fixed  at  its  initial  value,  since  the  role  of  the  extra  constraints  is 
purely  to  define  an  appropriate  null  space;  the  TQ  factorization  can  therefore  be  up¬ 
dated  in  the  normal  fashion  as  the  iterations  of  the  quadratic  programming  method 
proceed.  No  work  is  required  to  “delete"  the  temporary  constraints  associated  with 
Z2  when  Z[g  =  0,  since  this  simply  involves  repartitioning  Q.  When  deciding  which 
constraint  to  delete,  the  multiplier  vector  associated  with  the  rows  of  Zj  is  given  by 
Zjg,  and  the  multipliers  corresponding  to  the  rows  of  the  “true"  working  set  Aw  are 
the  least-squares  multipliers  that  would  be  obtained  if  the  temporary  constraints 
were  not  present  (see  Gill  et  ai,  1985). 

QPSFA  starts  with  a  positive  definite  projected  Hessian  and  keeps  it  as  positive 
definite  as  possible.  The  single-phase  method  of  Chapter  Three  is  a  more  general 
method  that  does  not  impose  this  requirement.  We  could  make  the  single-phase 
method  coincide  with  QPSFA  by  starting  it  with  a  positive  definite  projected  Hes¬ 
sian  (a  vertex  if  necessary)  and  using  the  same  strategy  that  QPSFA  uses  to  keep 
the  projected  Hessian  as  positive  definite  as  possible.  By  doing  this,  the  single-phase 
method  and  QPSFA  would  handle  indefiniteness  in  the  same  way.  However,  to  start 
with  a  positive  definite  projected  Hessian,  the  single-phase  method  using  the  La- 
grangian  method  to  solve  the  augmented  system,  would  add  artificial  bounds.  The 
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single-phase  method  could  not  use  the  above  QPSFA  strategy  of  adding  rows  of  Z 
as  temporary  general  constraints,  since  Z  is  not  available  in  a  Lagrangian  method. 

4.6.  Constraint  deletion  strategies 

4.0.1.  Introduction.  In  QPSFA,  the  constraint  exchange  accomplished  when 
the  SR  finds  a  suitable  constraint  to  delete,  can  be  viewed  as  a  constraint  deletion 
and  a  constraint  addition.  Thus,  constraints  are  added  in  two  circumstances  and 
deleted  in  two  circumstances  (see  Section  4.9).  The  strategies  used  for  adding  con¬ 
straints  in  both  circumstances  are  identical  and  straightforward.  If  a  step  of  unity 
can  be  taken,  without  hitting  a  constraint,  no  constraint  is  added.  Otherwise,  the 
constraint  added  is  the  closest  one  to  the  current  iterate  along  the  search  direction. 
When  deleting  constraints  there  are  several  strategies  possible.  These  strategies 
differ  between  the  two  circumstances  and  are  not  as  straightforward.  In  the  first 
circumstance,  the  SR  determines  if  there  is  a  suitable  constraint  to  delete  (see  Sec¬ 
tion  4.6.4).  If  there  is  more  than  one  suitable  constraint,  QPSFA  must  determine 
which  one  should  be  deleted  (see  Section  4.6.5).  It  must  also  decide  if  more  than 
one  constraint  should  be  deleted  (see  Section  4.6.6).  In  the  second  circumstance, 
the  constraint  with  the  smallest  negative  multiplier  is  deleted.  If  more  than  one 
multiplier  is  negative,  the  algorithm  must  decide  if  more  than  one  constraint  should 
be  deleted  (see  Section  4.6.6).  The  algorithm  allows  constraints  to  be  deleted  be¬ 
fore  the  current  iterate  is  a  constrained  stationary  point  (see  Section  4.6.2).  Also, 
even  if  the  the  current  iterate  is  infeasible,  constraint  deletions  can  be  made  (see 
Section  4.6.3).  QPSFA  maintains  a  projected  Hessian  with  at  most  one  nonposi¬ 
tive  eigenvalue.  If  the  projected  Hessian  is  indefinite  (one  nonpositive  eigenvalue) 
and  a  constraint  is  deleted,  the  new  projected  Hessian  might  have  more  than  one 
nonpositive  eigenvalue.  Thus,  the  algorithm  requires  that  the  projected  Hessian  be 
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positive  definite  before  departing  the  inner  repeat  loop  and  deleting  a  constraint  on 
the  basis  of  a  negative  multiplier. 

Some  of  the  constraint  deletion  strategies  discussed  in  this  section  do  not  affect 
the  convergence  of  QPSFA.  However,  they  are  included  for  the  following  reason. 
As  constraints  are  added  to  the  working  set,  the  search  direction  becomes  less  de¬ 
pendent  on  the  quadratic  objective  function.  If  enough  constraints  are  added,  the 
working  set  determines  a  vertex.  If  this  happens,  the  search  direction  is  the  vector 
from  the  current  iterate  to  the  vertex,  and  is,  therefore,  completely  independent 
of  the  objective  function.  This  would  make  the  algorithm  similar  to  a  two-phase 
method,  where  the  first  phase  consists  of  finding  a  feasible  point  by  calculating  the 
search  directions  to  successive  vertices.  By  allowing  more  constraint  deletions  to 
occur,  the  working  set  will  be  less  likely  to  determine  a  vertex.  Thus,  the  search 
direction  will  be  more  dependent  on  the  objective  function  than  if  the  additional 
deletions  did  not  occur.  These  additional  deletion  strategies  are  described  in  Sec¬ 
tions  4.6.2  and  4.6.3. 

4.6.2.  Constraint  deletion  before  finding  manifold  minimum.  Consid¬ 
erable  effort  has  been  made  in  many  branches  of  mathematical  programming  to 
determine  the  “best"  choice  of  constraints  to  be  deleted  (i.e.  that  which  results  in 
the  least  number  of  iterations  on  average).  Experimental  results  are  mixed,  but 
intuitively  a  strategy  based  on  maximizing  the  likely  reduction  in  the  objective 
function  would  seem  to  be  superior  (see  Gill  and  Murray,  1979).  Unfortunately,  the 
eifort  to  implement  such  a  strategy  by  known  techniques  is  prohibitively  expensive 
for  almost  all  types  of  problems  (including  QP),  even  assuming  an  optimistic  view 
of  the  reduction  in  the  number  of  iterations.  Hence,  many  methods  choose  the 
constraint  with  the  most  negative  multiplier  as  the  constraint  to  delete. 

Ma'^y  active-set  methods  also  adopt  the  strategy  of  waiting  until  the  current 
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iterate  is  a  rainimum  on  a  manifold  before  deleting  any  constraint.  If  there  is  more 
than  one  negative  multiplier  at  the  minimum  on  a  manifold,  deleting  a  constraint 
corresponding  to  any  such  multiplier  will  give  convergence  in  a  finite  number  of 
iterations  The  active-set  method  presented  here  will  adopt  the  strategy  of  deleting 
the  constraint  with  the  most  negative  multiplier,  but  may  delete  constraints  at 
points  other  than  constrained  stationary  points. 

If  a  constraint  with  a  negative  multiplier  is  deleted,  we  would  like  the  new 
search  direction  to  be  a  feasible  direction  with  respect  to  the  deleted  constraint  and 
to  be  a  descent  direction.  This  is  in  fact  true,  whether  or  not  the  current  iterate  is 
a  constrained  stationary  point. 

Theorem  4.  Suppose  that  (2.14)  equals  sero  and  Z'^HZ  is  positive  deBnite. 
Assume  that  when  (2.14)  is  solved,  the  j-th  multiplier  is  negative.  If  constraint  j 
is  deleted,  and  the  new  search  direction  is  denoted  by  p,  then  ajp  is  positive  and 
g^p  is  negative. 

Proof:  Z^HZ  is  positive  definite.  When  a  constraint  is  deleted  from  the  working 
set,  express  Z  by  Z  =  {Z  z).  If  the  new  projected  Hessian,  Z^HZ,  is  indefinite, 
R  is  modified.  In  this  case,  we  are  conceptually  changing  H  to  B,  such  that  Z'^BZ 
is  positive  definite  (see  Section  4.5.1). 

Let  Am'  denote  Aw  with  the  j-th  row  omitted.  Similarly,  denote  p  with  the 
;-th  multiplier  omitted,  by  p.  Let  the  vectors  p  and  p  solve  the  augmented  system 
(2.14),  rewritten  as 


Now,  delete  the  ;-th  constraint  from  Aw,  and  let  p  and  p  solve  the  augmented 
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system  given  by 


it 


The  first  rows  of  (4.23)  and  (4.24),  respectively,  give 

Bp  -  Alii  -  ajpj  =  -g, 


(4.24) 


(4.25) 


Bp-  Alfi  =  -g. 


(4.26) 


K  2^H2  is  indefinite,  modify  H  to  give  B,  using  equation  (4.19).  Adding 
azz^p  to  both  sides  of  (4.25),  gives 


Bp  -  Alii  -  Ojiij  =  -g  +  (TZZ^p. 


(4.27) 


Since  the  vector  p  can  be  written  as  p  =  Zpg  and  z^Z  is  sero,  the  product  z^p 
is  zero.  Thus,  (4.27)  can  be  simplified  to 


Bp  -  Alp  -  ajp.  = 


(4.28) 


Let  p  =  p  -  p  and  p  =  fi  -  ii.  Subtract  (4.28)  from  (4.26),  and  premultiply  by 


p^  to  form 


p^Hp  -  P^AIp  +  =  0. 


(4.29) 


Equations  (4.23)  and  (4.24)  imply  that  A^  p  equals  zero.  This  implies  that  p 
can  be  written  as  p  =  Zpz  for  some  pg.  Also,  Cyp  equals  zero,  so  (4.29)  can  be 


written  as 


pI{Z'^BZ)Pz  +  p’^ajp.j  =  0. 


(4.30) 


Z^BZ  is  positive  definite  and  pj  is  negative  so  (4.30)  implies  that  a^p  is  positive. 


Next,  premultiply  (4.26)  by  p^  giving 


p'^Bp  -  p^Ap^  p.  =  -p'^g. 


(4.31) 
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A^  p  equals  zero  and  p  can  be  written  sts  p  =  2pg.  Along  with  (4.31),  these 
equations  imply  that 

PI{2'^B2)P^  =  -p^g.  (4.32) 

2^H2  is  positive  definite  so  (4.32)  implies  that  p^g  is  negative.  | 

Theorem  4  states  that  deleting  a  constraint  with  a  negative  multiplier  before 
the  projected  gradient  vanishes  will  result  in  a  new  search  direction  that  is  a  de¬ 
scent  direction  and  a  feasible  direction  with  respect  to  the  deleted  constraint.  We 
shall  now  develop  the  specific  strategy  of  deleting  constraints  before  the  projected 
gradient  vanishes.  (For  other  strategies  see  Gill  and  Murray,  1974,  and  Gould, 
1982.) 

Suppose  that  the  y-th  constraint  has  the  most  negative  multiplier.  Two  possible 
extreme  strategies  for  deleting  the  y-th  constraint  follow: 

(  i)  Delete  the  y-th  constraint  only  if  the  projected  gradient  is  zero. 

(it)  Delete  the  y-th  constraint  whenever  its  multiplier,  fSj,  is  negative. 

If  the  projected  gradient  is  near  zero  and  fij  is  negative  and  has  a  “large"  mag¬ 
nitude,  it  seems  likely  the  objective  function  could  be  reduced  more  by  deleting  the 
y-th  constraint.  If  the  projected  gradient  is  “large"  and  fSj  has  a  “small"  magnitude, 
it  seems  unlikely  that  deleting  the  j'-th  constraint  would  result  in  a  significant  reduc¬ 
tion  in  the  objective  function  over  that  achieved  by  not  deleting  the  j'-th  constraint. 
Moreover,  since  a  large  step  usually  implies  additional  constraints  become  active,  it 
may  be  the  multiplier  of  the  y-th  constraint  changes  sign  in  subsequent  iterations. 
Thus,  a  strategy  that  uses  the  relative  magnitudes  of  Hj  and  the  projected  gradient, 
would  appear  to  be  better  than  either  (i)  or  (it)  separately. 

Let  I  denote  the  current  iterate,  t  denote  the  minimum  of  the  current  manifold, 
and  z  denote  the  minimum  of  the  current  manifold  with  constraint  j  deleted.  Denote 
the  value  of  the  objective  function  at  i  by  F(z).  Let  g  denote  the  gradient  at  i,  and 
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S  denote  the  gradient  at  £.  Finally,  define  the  following  search  directions:  p  =  £—x, 
p  =  X  —  X,  and  p  =  x  —  i. 

Following  are  expressions  for  the  differences  in  objective  values  between  i  and 
X,  and  between  £  and  x: 


F(t)  -  F(x)  = 

(4.33) 

F(i)  -  f(J)  = 

(4.34) 

The  term  p'^g  is  readily  available,  but  p'^§  is  not.  We  shall  now  derive  a  different 
expression  for  p'^g  which  can  be  computed  easily.  Since  equals  A^  p  is 

zero,  and  A^  p  equals  ajp  times  the  j-th  unit  vector,  the  product  g^p  can  be  written 

as 

g^p  =  m^A^P  =  MjOjP-  (4-35) 

Let  Z  correspond  to  the  current  manifold  and  Z  to  the  current  manifold  ex¬ 
cluding  the  ;-th  constraint.  Define  the  vector  z  by  Z  =  ( Z  z).  The  vector  p  can 
be  written  as  Zpz  for  some  p^.  Also,  by  the  definition  of  Z,  aJZ  equals  zero. 
Express  ajp  by 

ajp  =  ajZpj,  =  ajzp,  (4.36) 

where  p  is  the  last  component  of  pz- 

Using  (4.35),  (4.36),  and  (4.9)  in  order,  implies  that  p^g  can  be  expressed  as 
p^g  =  fiyajp  =  (4.37) 

where  £  is  a  positive  constant. 

The  decrease  in  objective  value  from  the  current  iterate  to  the  minimum  on  the 
current  manifold  is  given  by  —^p^g-  (See  Fletcher,  1980,  for  the  case  where  the 
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QP  is  positive  definite  and  does  not  have  any  general  constraints.)  The  decrease 
from  the  current  iterate  to  the  minimum  on  the  current  manifold  excluding  the  j-th 
constraint  is  given  by  -^p^g  -  S/ij.  Calculating  6  for  each  constraint  is  computa¬ 
tionally  expensive.  Also,  even  if  6  were  calculated,  the  above  reductions  may  not 
be  achieved  since  £  and  x  may  violate  an  inactive  constraint.  Thus,  the  following 
heuristic  strategy  will  be  used  in  QPSFA. 

If  constraint  j  has  the  most  negative  multiplier,  it  will  be  deleted  if 

T 

—  <e,  (4.38) 

where  0  is  a  positive  parameter.  Thus,  if  the  projected  gradient  is  small  relative  to 
the  magnitude  of  the  smallest  multiplier  and  9,  the  corresponding  constraint  will 
be  deleted.  In  Chapter  Five  we  report  results  for  different  values  of  9.  Note,  that 
if  9  is  set  equal  to  zero,  this  strategy  is  the  same  as  (i).  If  ^  is  large  enough,  this 
strategy  is  the  same  as  (ti). 

In  Section  4.5.1,  the  assumption  was  made  that  the  current  iterate  was  a  non- 
optimal  constrained  stationary  point.  In  this  case,  we  can  delete  a  constraint  and 
find  a  direction  of  negative  curvature,  since  Z^HZ  is  indefinite.  QPSFA  allows 
for  deletions  when  not  at  a  constrained  stationary  point.  In  this  case.  Theorem  4 
has  shown  that  the  search  direction  is  a  descent  direction.  If  Z^H2  is  indefinite 
after  deleting  a  constraint  other  than  at  a  constrained  stationary  point,  the  search 
direction  is  not  necessarily  a  direction  of  negative  curvature,  so  the  step  length  is 
limited  to  unity.  If  a  constraint  is  not  hit  prior  to  a  step  of  unity,  Z  is  unchanged 
and  Z’^HZ  remains  indefinite.  However,  the  next  search  direction  is  a  direction  of 
negative  curvature  along  which  the  step  length  is  not  limited. 

4.6.3.  Constraint  deletion  at  an  infeasible  point.  In  general,  if  a  residual 
of  any  constraint  in  the  working  set  is  nonzero,  and  a  constraint  is  deleted  (even 
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one  with  a  negative  multiplier),  there  is  no  guarantee  that  the  search  direction  will 
be  a  descent  direction  and  move  feasibly  with  respect  to  the  deleted  constraint. 
(Theorem  4  requires  that  the  residuals  in  the  working  set  equal  sero.)  Thus,  the 
same  constraint  that  was  deleted  could  be  added  again  before  the  current  iterate 
changes,  and  cycling  could  result. 

Note  that  the  working  set  may  not  contain  aU  of  the  general  constraints  that 
are  violated  at  a  given  point.  However,  even  though  the  current  iterate  is  infeasible, 
constraints  can  be  deleted  as  long  as  the  working  set  residual  rw  is  zero. 

Theorem  S.  If  constraints  with  negative  multipliers  are  deleted  at  points  where 
rw  =  0,  QPSFA  will  converge  to  a  feasible  point. 

Proof:  We  shall  show  that  QPSFA  cannot  return  to  the  same  point,  and  thus  will 
converge  to  a  feasible  point. 

Theorem  4  states  that  the  search  direction  is  a  feasible  direction  with  respect 
to  the  deleted  constraint.  If  afp  is  positive  for  all  violated  general  constraints,  a  is 
positive.  This  implies  that  each  nonzero  residual  is  decreased.  Since  the  residual 
of  each  constraint  is  nonincreasing,  the  algorithm  cannot  return  to  the  same  point. 
If  ajp  is  nonpositive  for  some  violated  general  constraint,  the  constraint  is  added 
to  the  working  set.  This  constraint  has  a  nonzero  residual,  so  the  new  has 
a  nonzero  component.  Since  the  new  rw  is  nonzero,  the  algorithm  cannot  delete 
another  constraint  until  this  new  rw  is  the  zero  vector  again.  Again,  since  the 
residual  of  each  constraint  is  nonincreasing,  the  algorithm  cannot  return  to  the 
same  point.  | 

If  rw  is  not  zero  and  a  constraint  with  a  negative  multiplier  is  deleted,  there  is 
no  guarantee  that  the  search  direction  is  a  descent  direction  oris  a  feasible  direction 
with  respect  to  the  deleted  constraint.  However,  if  rw  is  nonzero,  a  constraint  with 
a  negative  multiplier  is  deleted,  and  the  resulting  2^H2  is  positive  definite,  the 
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search  direction  will  be  a  feasible  direction  with  respect  to  the  deleted  constraint. 
(Note  that  the  definiteness  of  2’^H2  can  be  determined  only  after  the  constraint  has 
been  deleted.)  The  objective  value  at  the  minimum  of  the  current  working  set  with 
a  constraint  deleted  is  less  than  the  objective  value  at  the  minimum  of  the  current 
working  set.  However,  both  of  these  values  may  be  greater  than  the  objective  value 
at  the  current  iterate,  since  it  is  infeasible.  Thus,  in  both  cases,  the  search  direction 
is  not  necessarily  a  descent  direction.  However,  it  still  may  be  efficient  to  delete 
constraints  in  the  above  circumstance. 

Theorem  6.  Assume  that  the  j-tb  multiplier  is  negative  when  (2.14)  is  solved  for 
p  and  fi.  Let  p  and  2^H2  denote  the  search  direction  and  projected  Hessian  when 
constraint  j  is  deleted.  If  constraint  j  is  deleted  and  2^H2  is  positive  definite, 
then  ajp  is  positive. 

Let  Aw  denote  Aw  with  the  j-th  row  omitted.  Similarly,  let  p  denote  p  with 
the  y-th  multiplier  omitted,  and  fw  denote  fw  with  the  j-th  component  omitted. 
Let  the  vectors  p  and  p  solve  the  augmented  system  (2.14),  rewritten  as 


( 


H 

Aw 


Now,  delete  the  j-th  constraint  from 
system  given  by 


I 


(4.39) 


and  let  p  and  p  solve  the  augmented 


Al 


(4.40) 


The  first  rows  of  (4.39)  and  (4.40),  respectively,  give 
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Hp-Al(i  =  -g.  (4.42) 

Let  p  =  p  —  p  and  fi  =  fi  —  fi.  Subtract  (4.41)  from  (4.42),  and  premultiply  by 
to  form 

p'^Hp  -  p^Al  fi,  +  p\  fij  =0.  (4.43) 

Equations  (4.39)  and  (4.40)  imply  that  A^p  equals  zero.  This  implies  that  p 
can  be  written  as  p  =  2pz  for  some  p*.  Equation  (4.43)  can  now  be  written  as 

pli2'^H 2)Pz  +  P^ajPj  -  p^ajpj  =  0.  (4.44) 

The  matrix  2‘^H2  is  positive  definite,  /xy  is  negative,  and  p^ay  is  positive,  so  (4.44) 
implies  that  ajp  is  positive.  | 

If  a  constraint  with  a  negative  multiplier  is  deleted  and  2'^H2  is  indefinite,  ojp 
could  be  negative.  This  might  cause  the  constraint  to  be  immediately  added.  In 
this  event,  cycling  could  occur.  Thus,  the  work  to  delete  the  constraint,  update  the 
factorizations  (discovering  that  2'^H2  is  indefinite),  and  then  immediately  add  the 
constraint  and  regain  the  previous  factorizations  is  work  that  would  not  have  been 
required  if  the  constraint  had  not  been  deleted.  If  is  negative  definite,  2^H2 
will  always  be  indefinite,  li  H  is  indefinite,  2^H2  might  be  indefinite.  The  only 
way  that  2^H2  will  always  be  positive  definite  is  if  H  is  positive  definite.  Because 
of  the  potential  for  wasted  effort,  this  strategy  will  only  be  implemented  when  H  is 
known  to  be  positive  deSnite. 

4.6.4.  Choosing  a  constraint  to  delete  using  the  SR.  The  SR  will  deter¬ 
mine  a  suitable  constraint  to  delete  if  P  is  positive  for  any  fixed  constraint  (see 
Section  3.2.1).  If  is  positive  for  more  than  one  constraint,  any  of  the  suitable 
constraints  could  be  deleted.  However,  we  would  like  to  choose  the  “best"  con¬ 
straint  to  delete.  It  is  more  difficult  to  describe  “best”  in  this  context  than  when 
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discussing  multipliers,  since  QPSFA  searches  simultaneously  for  an  optimal  and  a 
feasible  point.  One  good  strategy  would  be  to  find  a  constraint  to  delete  such  that 
the  new  search  direction  could  take  a  full  step  and  satisfy  all  constraints  in  the 
working  set.  Such  a  constraint  may  not  exist.  Moreover,  such  a  strategy  would  be 
prohibitively  expensive  to  implement.  However,  a  heuristic  based  on  this  approach 
may  be  implemented.  This  heuristic  strategy  chooses  the  constraint  corresponding 
to  the  smallest  positive  Thus,  the  new  search  direction  is  “shortest”  and  the 
probability  of  hitting  a  constraint  is  “minimized”. 

Another  strategy  is  to  choose  from  the  suitable,  fixed  constraints  the  one  with 
the  most  negative  multiplier.  However,  when  the  SR  is  used  the  current  multipliers 
are  not  usually  available.  Thus,  in  order  to  test  this  strategy,  (2.9d)  must  be  solved 
each  time  the  SR  is  used. 

A  third  strategy  is  to  choose  from  the  suitable  fixed  constraints  the  one  with 
the  smallest  ratio  of  the  multiplier  to  the  corresponding  p.  This  combines  the  first 
two  strategies  by  choosing  a  constraint  with  a  small  positive  p  and  a  large  negative 
multiplier. 

4.6.5.  Multiple  constraint  deletions.  Constraints  are  deleted  in  QPSFA  in 
two  circumstances.  In  the  first  circumstance,  the  SR  finds  a  suitable  constraint  to 
delete  to  avoid  singularity.  In  the  second  circumstance  a  constraint  with  a  negative 
multiplier  is  deleted.  If  there  are  multiple  permissible  choices  of  constraints  to  delete 
in  either  circumstance,  we  may  delete  more  than  one  constraint.  If  more  than  one 
constraint  is  deleted,  the  decision  as  to  which  is  the  “best”  set  to  delete  is  very 
difficult. 

It  was  shown  in  Section  4.6.2  that  if  a  constraint  with  a  negative  multiplier 
is  deleted  when  the  working  set  residual  is  zero,  the  search  direction  is  a  descent 
direction  that  moves  feasibly  with  respect  to  the  deleted  constraint.  If  more  than 
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one  constraint  with  a  negative  multiplier  is  deleted,  the  search  direction  is  still  a 
descent  direction.  However,  it  does  not  necessarily  move  in  a  feasible  direction  with 
respect  to  all  of  the  deleted  constraints.  This  implies  that  a  deleted  constraint  could 
be  bit  and  immediately  added  back  to  the  working  set.  Thus,  the  multiple-deletion 
strategy  could  result  in  more  computation  time.  If  multiple  deletions  occur  and  the 
search  direction  does  move  feasibly  with  respect  to  all  of  the  deleted  constraints, 
the  algorithm  has  calculated  only  one  search  direction  and  one  set  of  multipliers, 
while  "successfully”  deleting  several  constraints.  If  the  deletions  had  been  made 
one  at  a  time,  the  algorithm  would  have  calculated  a  search  direction  and  a  set  of 
multipliers  for  each  deletion.  Thus,  the  multiple-deletion  strategy  could  result  in 
less  computation  time.  The  effect  that  the  multiple-deletion  strategy  has  on  the 
computation  time  depends  on  the  particular  problem. 

An  analogous  dependence  occurs  when  multiple  deletions  are  made  on  the  basis 
of  the  SR.  We  saw  in  Section  3.1  that  if  the  SR  finds  a  suitable  constraint  to  delete, 
the  search  direction  is  a  feasible  direction  with  respect  to  the  deleted  constraint. 
If  more  than  one  suitable  constraint  is  deleted,  the  search  direction  does  not  nec¬ 
essarily  move  in  a  feasible  direction  with  respect  to  all  of  the  deleted  constraints. 
This  implies  that  a  deleted  constraint  could  be  bit  and  immediately  added  back 
to  the  working  set.  As  in  the  previous  paragraph,  this  implies  that  the  effect  on 
computation  time  is  dependent  on  the  particular  problem. 

Indefiniteness  of  the  projected  Hessian  could  prevent  multiple  deletions.  If 
the  SR  is  used  and  a  suitable  constraint  is  deleted,  Z^HZ  remains  the  same  (see 
Lemma  7).  If  any  additional  suitable  constraints  are  deleted,  the  projected  Hessian 
could  have  more  than  one  nonpositive  eigenvalue.  Thus,  we  may  be  restricted  from 
deleting  more  than  one  constraint  by  the  requirement  that  the  projected  Hessian 
has  at  most  one  nonpositive  eigenvalue.  Similarly,  if  a  constraint  with  a  negative 
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multiplier  is  deleted,  the  projected  Hessian  could  become  indefinite.  If  any  addi¬ 
tional  constraints  with  negative  multipliers  are  deleted,  the  projected  Hessian  could 
have  more  than  one  nonpositive  eigenvalue.  QPSFA  maintains  a  projected  Hes¬ 
sian  with  at  most  one  nonpositive  eigenvalue.  If  the  projected  Hessian  is  indefi^.te, 
constraints  can  not  be  deleted  (exchanges  using  the  SR  are  permissible)  until  the 
projected  Hessian  is  positive  definite  again.  Multiple  constraint  deletions  might 
result  in  a  projected  Hessian  with  more  than  one  nonpositive  eigenvalue.  Thus,  if 
multiple  constraint  deletions  are  allowed,  the  definiteness  of  the  projected  Hessian 
must  be  determined  after  each  deletion.  Constraint  deletions  must  not  be  allowed  if 
they  would  result  in  a  projected  Hessian  with  more  than  one  nonpositive  eigenvalue. 
Thus,  the  work  to  determine  if  the  new  Z^HZ  has  more  than  one  nonpositive  eigen¬ 
value,  along  with  the  work  to  return  to  the  factorization  of  Z^HZ,  is  work  that 
would  not  have  been  done  if  the  multiple-deletion  strategy  were  not  used. 

The  determination  of  the  “best"  single  constraint  to  delete  is  not  performed 
because  it  is  considered  to  be  too  computationally  expensive.  Thus,  the  heuristic 
rule  of  choosing  the  constraint  with  the  largest  multiplier  or  the  smallest  positive 
P  is  used.  The  determination  of  the  “best”  set  of  constraints  to  delete  is  certainly 
more  difficult  than  choosing  the  best  single  constraint.  Therefore,  some  heuristic 
rule  must  apparently  be  used  again. 

The  primary  objection  to  the  multiple-deletion  strategy  is  that  a  deleted  con¬ 
straint  might  immediately  be  hit  and  added  back  to  the  working  set.  Because  of 
this  objection  and  the  other  difficulties  discussed  in  this  section,  we  only  consider 
the  case  of  deleting  a  single  constraint. 
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4.7.  ResolTition  of  singularity  in  QPSFA 

Singularity  in  K  can  occur  either  when  a  variable  is  deleted  or  added  to  the  working 
set.  If  singularity  occurs  when  a  variable  is  deleted,  the  new  projected  Hessian  is 
singular.  In  this  event  (and  when  the  projected  Hessian  becomes  indefinite),  we  shall 
use  the  strategy  of  altering  R — conceptually  changing  H  to  S — that  was  described 
in  Section  4.5.1. 

If  singularity  occurs  when  a  variable  is  added  to  the  working  set,  the  SR  will 
be  used.  The  SR  derived  in  Section  3.1  assumes  that  the  Lagrangfan  method  will 
be  used  to  solve  a  QP  with  general  constraints  that  are  equalities.  We  shall  now 
formulate  a  different  derivation  of  the  SR,  one  that  is  suitable  for  use  with  QPSFA — 
a  projection  algorithm  with  general  constraints  that  are  inequalities.  Since  the 
implementation  of  QPSFA  treats  bounds  and  general  constraints  separately,  the 
following  derivation  of  the  SR  also  treats  them  separately.  Thus,  the  derivation 
of  the  SR  will  use  the  matrix  i4rii — the  portion  of  Aw  corresponding  to  the  free 
variables. 

This  different  derivation  is  done  for  two  reasons.  First,  the  matrix  C  contains  all 
general  constraints,  so  the  calculation  of  P  (3.12)  only  involves  adding  and  deleting 
bounds.  The  implementation  of  QPSFA  involves  the  TQ  factorization  of  >4,^,  which 
does  not  contain  all  general  constraints.  Thus,  bounds  and  general  constraints  can 
be  added  and  deleted.  Second,  if  singularity  occurs  when  a  variable  is  added  to 
the  working  set,  the  constraints  are  dependent,  since  Z^BZ  is  positive  definite. 
Since  the  singularity  in  the  constraints  does  not  depend  on  the  Hessian,  the  SR  can 
be  derived  without  considering  the  Hessian.  Thus,  in  QPSFA  the  detection  and 
avoidance  of  singularity  when  a  variable  is  added  are  simplified  because  the  source 
of  singularity  is  known  to  be  in  the  constraints.  Equation  (3.12)  will  be  equivalent 
to  the  new  formula  for  P  in  the  sense  that  deletion  (addition)  of  a  slack  variable 
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from  C  is  the  same  as  adding  (deleting)  a  general  constraint  to  ArK-  First,  a  means 
to  detect  singularity  when  a  variable  is  added  to  the  working  set  will  be  described. 
Then,  an  equivalent  formulation  of  the  SR  will  be  developed. 

In  Section  3.2  singularity  in  the  new  augmented  system  using  a  Lagrangian 
method  could  be  detected  only  after  updating  the  factorization  or  solving  a  linear 
system  involving  K,  This  could  be  computationally  expensive.  Since  updates  to 
the  TQ  factorization  have  already  been  described,  the  detection  of  singularity  in 
the  new  T  is  straightforward.  Singularity  in  the  new  T  can  occur  only  if  a  general 
constraint  or  a  bound  is  added  to  the  working  set. 

If  a  general  constraint  or  a  bound  is  added  to  the  working  set,  the  new  T  would 
be  formed  by  applying  a  sequence  of  plane  rotations  to  (see  Section  4.3.1). 
These  rotations  will  transform  tuj  into  the  unit  vector  (with  the  last  component 
one)  times  a  nonzero  number.  If  tyj  is  initially  the  zero  vector,  plane  rotations 
cannot  transform  it  into  a  vector  with  a  nonzero  final  component.  This  implies  that 
if  lyj  is  the  zero  vector,  the  new  T  would  be  singular. 

If  a  general  constraint  is  being  added,  lyj  =  aJj^Z,  where  corresponds  to 
the  free  components  of  the  general  constraint  that  is  being  added.  If  a  bound  is 
being  added,  tyj  =  so  u;^  is  the  last  row  of  Z  (see  Gill  et  aJ.,  1982).  Thus, 

singularity  in  the  new  T  can  be  detected  before  updating  Q  and  T,  and  without  the 
extra  factorization  or  solve  of  a  linear  system  required  in  the  Lagrangian  method. 

Suppose  the  resulting  T  is  singular  if  a  constraint  with  index  t  were  added  to 
the  working  set.  The  SR  tries  to  find  a  constraint,  say  with  index  j,  that  if  deleted 
from  the  working  set  would  ensure  that  (o)  the  new  T  would  be  nonsingular,  and 
(6)  the  new  search  direction  would  be  a  feasible  direction  with  respect  to  the  deleted 


constraint. 
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In  order  to  do  this  the  SR  solves  the  following  system  of  equations  for  P: 


79 


where  r  denotes  r»..  Let  p  denote  the  new  search  direction,  excluding  the  j-th 
component.  The  j-th  component  of  the  new  search  direction  is  denoted  by  If 
a  bound  is  hit,  then  h  =  e,,  and  t  =  0  (forcing  the  t-th  component  of  p  to  equal 
zero).  If  a  general  constraint  is  added  to  the  working  set,  then  h  consists  of  the 
free  components  of  the  i-th  row  of  A,  and  t  is  the  corresponding  (possibly  nonzero) 
residual.  (This  forces  the  new  search  direction  to  satisfy  the  constraint  indexed  by 
i.)  If  a  general  constraint  is  deleted  from  the  working  set,  t  =  — ey.  If  a  variable 
is  released  from  its  bound,  t  consists  of  the  components  of  the  j-th  column  of  A 
corresponding  to  the  general  constraints  in  the  working  set.  If  a  general  constraint 
is  hit,  and  a  variable  is  released  from  its  bound,  c  is  the  single  element  in  the  i-th 
row  and  j-th  column  of  A\  otherwise  c  is  zero. 

Let  q  be  the  solution  of  ( Z  Y)q  =  p.  Rewrite  (4.45)  as 


Now  perform  the  indicated  multiplication  of  the  6rst  two  matrices  in  (4.46), 
and  expand  q  into  components  relating  to  Z  and  Y.  Equations  (4.46)  can  then  be 
written  as 


(ArnZ  AY 
\  IJZ  hTY 


(4.47) 


Next,  recalling  that  ArtiY  =  T,  A^Z  =  0,  and  h'^Z  =  0  (since  the  new  T  is 
singular),  equation  (4.47)  can  be  written  as 
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Hence 

Tqy  +  lp  =  -r, 

(4.49) 

and 

hZYqy  +cP  =  i. 

(4.50) 

Let  u  be  the  solution  of  h'^Y  =  u^T.  Pre- multiplying  (4.49)  by  gives 

h^Ygv  +  =  —  u^’r.  (4.51) 

From  (4.50)  and  (4.51)  we  get 

-cp  -  i  +  u'^ip  =  -u’'r.  (4.52) 


Hence 


MU-t 


(4.53) 


Equation  (4.53)  determines  a  P  for  each  constraint  in  the  working  set.  Note 
that  u  need  only  be  calculated  once  in  order  to  calculate  every  p.  If  the  denominator 
of  (4.53)  is  not  zero,  P  exists,  which  Implies  that  the  new  T  is  guaranteed  to  be 
nonsingular.  If  ^  is  positive  (assuming  j  is  on  its  lower  bound),  the  new  search 
direction  would  be  a  feasible  direction  with  respect  to  the  deleted  constraint.  Thus, 
the  SR  has  found  a  permissible  constraint  to  delete  if  any  P  has  the  proper  sign. 

In  general,  we  are  prohibited  from  deleting  constraints  when  the  projected  Hes¬ 
sian  is  indefinite.  However,  the  SR  may  have  to  be  used,  even  though  the  projected 
Hessian  is  indefinite.  We  shall  now  examine  how  the  constraint  deletion  performed 
when  the  SR  is  used,  alters  Z^BZ.  A  constraint  deletion  could  increase  the  number 
of  nonpositive  eigenvalues  of  Z^BZ.  However,  the  constraint  deletions  performed 
when  the  SR  is  used  will  not  change  Z^BZ.  (By  using  the  notation  of  Chapter  3, 
in  which  only  bounds  can  be  added  to  and  deleted  from  the  working  set,  the  proof 
of  the  following  lemma  is  simplified.) 
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Lemma  0.  If  the  SR  is  used,  and  Z^HZ  is  positive  deSnite,  then  the  old  Z  is  still 
a  satisfactory  basis  for  the  null  space  of  the  new  working  set.  Consequently,  the 
new  projected  Hessian  is  positive  dehnite. 

Proof:  Use  of  the  SR  implies  a  variable  (indexed  by  p)  was  hit  that  caused  C  to 
become  singular,  and  a  variable  (indexed  by  q)  was  found  that  when  released  from 
its  bound  made  the  new  C  nonsingular.  Since  C  became  singular  when  the  bound 
indexed  by  p  was  hit,  Lemma  3  implies  that  eJZ  =  0.  When  the  p-th  bound  is  fixed 
and  the  q-ih  bound  is  released,  the  new  C,  denoted  by  C,  is  formed  by  exchanging 
the  q-th.  column  for  the  p-th  column,  giving 

^  =  C -b  (a,  -  Op)cJ‘. 

Thus,  Z  does  not  need  to  be  altered  since 

CZ  =  iC  +  (a,  -  =  CZ+  (a,  -  Op)cJZ  =  0. 

Similarly,  the  new  H,  denoted  by  B,  can  be  expressed  as 

H  =  H  +  {h^-  hp)ej  +  Cp(hJ'  -  /ij)  - 

where  h,  and  hp  denote  columns  of  H,  and  is  the  p-th  element  of  hq  —  hp.  When 
Z'^BZ  is  formed,  we  see  that  Z'^BZ  is  equal  to  Z'^HZ.  | 
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4.8.  Infeasibility  detection  and  convergence  to  an  optimal  point 

In  the  single-phase  Lagrangian  method  it  is  not  possible  to  ascertain  immediately  if 
the  problem  is  infeasible  when  the  SR  does  not  find  a  suitable  variable  to  delete.  In 
QPSFA,  Z’^BZ  is  always  positive  definite.  Thus,  if  the  SR  does  not  find  a  suitable 
variable  to  delete,  Theorem  2  implies  that  the  problem  is  infeasible. 

Once  a  feasible  point  has  been  found,  all  future  iterates  remain  feasible.  Thus, 
the  algorithm  becomes  an  active-set  feasible-point  method.  An  outline  of  a  conver¬ 
gence  proof  is  given  below.  (In  this  proof,  it  is  required  that  a  minimum  be  found 
on  each  manifold.) 

Theorem  7.  QPSFA  will  find  an  optimal  point  (or  indicate  unboundedness  of  the 
objective),  given  an  initial  feasible  point. 


Proof:  The  inner  repeat  loop  will  find  the  minimum  on  a  manifold  in  a  finite  number 
of  iterations.  If  all  multipliers  are  nonnegative,  an  optimal  point  has  been  found. 
Otherwise,  a  constraint  with  a  negative  multiplier  is  deleted.  By  the  nondegeneracy 
assumption,  the  next  step  is  positive.  The  search  direction  is  a  descent  direction,  so 
the  objective  value  decreases.  This  implies  that  once  the  iterates  leave  a  manifold, 
they  cannot  return  to  the  same  manifold.  The  objective  value  is  monotonically 
decreasing,  so  the  algorithm  will  find  an  optimal  point,  or  a  direction  of  negative 
curvature  that  does  not  hit  any  constraint.  Furthermore,  there  are  only  a  finite 
number  of  manifolds,  so  the  algorithm  will  terminate  in  a  finite  number  of  iterations. 
I 
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Tnh\e  2 

Pseudo- Fortran  version  of  QPSFA 


pd  =  (  H  is  positive  definite  ) 

feas  =  (  iterate  satisfies  all  constraints  ) 

repeat 

repeat 

statpt  =  (  projected  gradient  equals  zero  ) 

wsnrm  =  (  norm  of  working  set  residuals  equals  zero  ) 

compute  p 

compute  second-order  multipliers 
mlrptg  =  (a  negative  multiplier  is  large  relative  to 
the  gradient  projected  along  p) 
if  (  not  ((statpt  or  mlrptg)  and  (pd  or  wsnrm) 
and  posdef  and  feas  ))  then 
compute  a 

if  (  constraint  is  hit  )  then 

if  (  T  is  singular  )  then 

if  (  SR  finds  constraint  to  delete  )  then 
exchange  constraints 

else 

error  =  tme 

end  if 

else 

add  constraint 

end  if 

end  if 


posdef  =  (  projected  Hessian  la  positive  definite  ) 
feas  =  (  iterate  satisfies  all  constraints  ) 

end  if 

nntil  (  (  (  statpt  or  mlrptg  )  and  (  pd  or  wsnrm  )  and  posdef  ) 
or  error  ) 

compute  smallest  multiplier 

optmul  =  (  smallest  multiplier  is  greater  than  zero  ) 
if  (  not  (  optmul  or  error  )  )  then 
delete  constraint 

posdef  =  (  projected  Hessian  is  positive  definite  ) 

end  if 

nntil  (  (  optmul  and  feas  )  or  error  ) 
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4.0.  Comparison  of  QPSFA  to  alternative  QP  methods 

Brief  descriptions  of  three  alternative  QP  methods  were  given  in  Section  2.6.  We 
shall  now  compare  QPSFA  to  these  three  methods.  In  terms  of  solving  an  EQP,  the 
methods  QPSFA,  QPSOL,  and  CS  are  null-space  projection  methods  which  solve 
the  augmented  system  using  (2.9).  The  residual,  r,  and  hence  the  py  term,  is  zero 
in  QPSOL,  but  not  in  QPSFA.  (Much  of  the  computer  code  used  by  QPSFA  is 
identical  to  the  code  of  QPSOL.)  GI  is  a  range-space  projection  method  which  uses 
(2.11). 

In  addition  to  the  fact  that  the  iterates  of  QPSFA  may  be  infeasible,  the  method 
is  not  dual  feasible.  In  terms  of  describing  how  the  method  controls  changes  to  the 
working  set,  this  implies  that  the  method  is  neither  an  active-set  feasible-point 
method,  nor  a  dual-feasible  active-set  method.  Once  a  violated  constraint  becomes 
satisfied,  the  method  does  not  allow  the  constraint  to  become  violated  again.  How¬ 
ever,  once  a  multiplier  becomes  positive,  the  method  does  not  attempt  to  keep  it 
positive.  In  this  sense,  the  method  is  more  similar  to  an  active-set  feasible-point 
method  than  a  dual-feasible  active-set  method.  QPSFA  is  similar  to  QPSOL  since 
it  deletes  constraints  on  the  basis  of  negative  multipliers,  and  adds  constraints  when 
they  restrict  the  step  to  the  minimum  on  the  manifold.  QPSFA  and  GI  both  add 
violated  constraints,  so  in  this  sense  they  are  similar.  (However,  QPSFA  and  GI 
use  different  criteria  for  adding  these  violated  constraints.  GI  adds  the  constraint 
that  is  most  violated  at  the  current  iterate.  QPSFA  adds  the  constraint  that  is 
most  violated  at  a  step  of  unity  along  the  search  direction.)  QPSFA  is  also  similar 
to  CS  in  that  it  can  be  described  as  an  exact  penalty  method,  where  the  penalty 
function  is  given  by 


P{x,P,p) 


if  ajp  >  0, 
if  afp  <  0, 


for  all  i  G  V 
for  any  i  G  V  ’ 
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where  p  is  sufficiently  large  and  V  denotes  the  set  of  violated  general  constraints. 
Note  that  the  summation  involves  only  the  t  constraints  in  the  working  set,  indi¬ 
cating  that  these  violated  constraints  are  the  only  ones  that  incur  a  penalty.  Of 
course,  one  could  not  use  such  a  penalty  function  in  an  algorithm,  unless  a  method 
of  determining  a  search  direction  that  would  not  incur  an  infinite  penalty  were 
given.  QPSFA  determines  such  a  search  direction,  so  it  can  be  viewed  as  using 
Due  to  the  infinite  penalty,  QPSFA  takes  steps  of  length  sero  until  it 
finds  an  appropriate  search  direction.  This  is  a  different  explanation  of  why  the 
sum  of  infeasibilities  in  QPSFA  is  a  decreasing  function. 

Each  algorithm  must  indicate  when  the  problem  is  infeasible.  When  the  SR  (in 
QPSFA)  fails  to  find  a  constraint  to  delete,  this  implies  that  it  has  found  an  infeasi¬ 
ble  subproblem.  GI  also  indicates  infeasibility  by  finding  an  infeasible  subproblem. 
Infeasibility  is  indicated  in  QPSOL  when  the  first  phase  does  not  find  a  feasible 
point.  With  a  sufficiently  large  p,  infeasibility  is  indicated  in  CS  when  the  solution 
of  the  unconstrained  problem  is  infeasible  with  regard  to  the  original  problem. 

QPSFA  and  QPSOL  handle  indefinite  QP’s  in  the  same  way — they  allow  at  most 
one  nonpositive  eigenvalue  in  the  projected  Hessian.  GI  does  not  accommodate  an 
indefinite  Hessian.  CS  allows  multiple  negative  eigenvalues  in  the  projected  Hessian. 

The  SR  is  unique  to  QPSFA.  It  is  necessary  in  QPSFA,  since  the  working  set 
could  become  singular  when  a  constraint  is  hit.  This  cannot  occur  in  an  active-set 
feasible-point  method,  such  as  QPSOL,  because  the  search  direction  will  not  inter¬ 
sect  any  dependent  constraint.  To  see  this,  assume  that  the  vector  ay  is  the  normal 
of  a  constraint  that  is  not  in  the  working  set  and  that  ay  is  a  linear  combination  of 
the  rows  of  C.  The  search  direction  p  satisfies  Cp  =  0,  and  it  follows  that  ajp  =  0. 
Thus,  p  will  not  intersect  the  dependent  constraint.  In  QPSFA,  the  search  direc¬ 
tion  satisfies  Cp  =  -r,  so  it  could  intersect  a  dependent  constraint.  GI  handles 
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singularity  in  the  working  set  when  a  constraint  is  added,  by  using  the  change  in 
the  existing  multipliers  to  determine  which  constraint  to  delete. 

The  assumption  of  nondegeneracy  is  needed  in  feasible-point  active-set  methods, 
because  of  a  problem  that  might  occur  when  a  constraint  is  deleted  from  the  working 
set  at  a  degenerate  point.  If  the  current  iterate  is  a  degenerate  point,  any  positive 
step  zdong  the  search  direction  may  violate  one  of  the  dependent  constraints  that 
was  not  in  the  working  set.  Such  a  constraint  must  then  be  added  to  the  working 
set.  If  X  is  not  a  vertex,  a  move  can  be  made  without  deleting  any  more  constraints 
from  the  working  set.  If  i  is  a  vertex,  a  constraint  must  be  dropped  from  the 
working  set.  Thus,  there  is  the  chance  that  the  sequence  of  working  sets  obtained 
by  deleting  and  adding  constraints  may  repeat  itself  after  finitely  many  steps — a 
phenomenon  known  as  cycling.  Note  that  degeneracy  is  not  a  difficulty  in  itself; 
but  when  it  is  present,  cycling  is  a  possibility. 

The  resolution  of  degeneracy  at  a  vertex,  i.e.  the  computation  of  a  search  di¬ 
rection  such  that  the  objective  function  undergoes  a  strict  decrease,  is  guaranteed  if 
enough  combinations  of  constraints  are  deleted  from  the  working  set.  However,  the 
resolution  of  degeneracy  at  a  vertex  is  essentially  a  combinatorial  problem  whose 
solution  may  require  a  significant  amount  of  computation.  Fortunately,  the  occur¬ 
rence  of  cycling  is  rare;  when  it  does  occur,  simple  heuristic  strategies  almost  always 
succeed  in  breaking  the  deadlock.  In  order  to  establish  convergence,  nondegener¬ 
acy  assumptions  are  made  in  QPSFA,  QPSOL,  and  CS.  In  GI,  a  nondegeneracy 
assumption  is  not  necessary  since  the  Hessian  is  positive  definite. 

Of  these  four  algorithms,  QPSFA  is  the  only  one  that  allows  constraints  to  be 
deleted  when  not  at  a  minimum  of  a  manifold.  However,  QPSOL  and  CS  could 
be  modified  to  allow  such  deletions.  Such  a  strategy  does  not  apply  to  GI,  since 
deletions  are  made  to  maintain  dual  feasibility. 
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All  four  of  these  algorithms  allow  for  the  deletion  of  constraints  when  the  cur¬ 
rent  iterate  is  infeasible.  However,  in  QPSOL,  this  occurs  only  in  the  first  phase. 
The  other  three  algorithms  are  single-phase  methods,  so  such  deletions  must  be 
permitted. 


Chapter  Five 
Nmnerical  Resnltf 


This  chapter  presents  computational  results  of  the  implementation  of  QPSFA 
(described  in  Chapter  Four).  The  quadratic  programs  were  randomly  generated  by 
a  method  that  allows  the  user  to  specify  the  numerical  properties  of  the  problem 
(see  Section  5.1).  Several  parameters  must  be  input  to  QPSFA  in  order  to  specify 
the  algorithm  completely.  The  results  obtained  by  using  different  combinations  of 
these  parameters  are  given  in  Section  5.2.  The  performances  of  QPSFA  and  QPSOL 
on  identical  problems  are  compared  in  Section  5.3. 

A  common  way  to  compare  algorithms  is  to  use  some  measure  of  the  amount 
of  work  performed  (e.g.,  execution  time,  number  of  iterations).  A  large  portion  of 
the  computational  effort  required  in  all  QP  methods  is  used  to  solve  the  augmented 
systems.  The  total  amount  of  work  is  roughly  proportional  to  the  number  of  solves 
of  the  augmented  system.  (An  augmented  system  must  be  solved  each  time  a 
constraint  is  added  to  or  deleted  from  the  working  set.)  Thus,  when  comparing 
different  parameter  settings  of  QPSFA,  and  QPSOL  to  QPSFA,  the  comparisons 
will  be  based  primarily  upon  the  total  number  of  solves  of  the  augmented  system. 
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6.1.  Generation  of  random  quadratic  problems 

The  most  common  objective  when  testing  a  QP  algorithm  is  to  compare  its  efficiency 
with  that  of  other  algorithms.  It  is  essential  to  be  able  to  specify  the  numerical 
properties  of  the  test  problems.  The  method  of  generating  dense  QP’s  used  in  this 
thesis  has  this  feature.  (This  method  of  generating  QP’s  is  described  by  Gould, 
1984.) 

With  this  method,  the  user  is  able  to  specify  certain  characteristics  of  the  QP. 
The  size  of  the  problem,  both  in  terms  of  the  number  of  variables  and  the  number 
of  constraints,  must  be  specified.  The  dimension  of  the  null  space  at  the  solution  is 
specified  by  inputing  the  desired  number  of  active  general  constraints  and  bounds 
at  the  solution.  The  conditioning  of  the  problem,  both  in  terms  of  the  objective 
function  and  the  constraints,  must  be  input.  Indefinite  QP’s  are  formed  by  making 
some  of  the  eigenvalues  of  the  Hessian  negative.  Linear  programs  can  be  formed 
simply  by  specifying  that  the  Hessian  is  the  null  matrix. 

By  varying  these  characteristics,  a  group  of  twenty  •typical”  problems  was 
formed.  (The  number  of  variables  was  20  or  less  and  the  number  of  general  con¬ 
straints  was  50  or  less.  The  dimension  of  the  null  space  at  the  solution  ranged 
from  zero  to  the  number  of  variables.  The  condition  numbers  of  the  Hessian  and 
the  constraints  varied  from  10*  to  10*^.  The  definiteness  of  the  problems  ranged 
from  being  positive  definite  to  almost  negative  definite.)  Using  this  group,  each 
of  the  comparisons  in  the  next  two  sections  is  based  on  varying  one  parameter  or 
characteristic  at  a  time,  while  the  others  remain  unchanged. 

Table  3  gives  the  important  characteristics  of  the  group  of  twenty  test  problems. 
By  designating  each  problem  by  a  number,  referencing  the  problem  in  other  tables 
is  greatly  simplified.  The  number  of  variables  is  denoted  by  N.  The  number  of 
constraints  is  denoted  by  NCLIN.  The  dimension  of  the  null  space  at  the  solution  is 
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denoted  by  NZ.  The  exponent  of  the  condition  number  of  the  constraints  and  of  the 
objective  function  is  denoted  by  COKD.  Finally,  the  number  of  negative  eigenvalues 
of  the  Hessian  is  denoted  by  KEGEV. 

Table  S 

Test  problem  characteristics 


Problem 

N 

KCLIN 

NZ  1 

COND 

NEGEV 

1 

10 

50 

0 

7  ' 

0 

2 

10 

50 

0 

7 

0 

3 

10 

so 

0 

7 

0 

4 

10 

50 

8 

7 

0 

5 

20 

50 

15 

7 

0 

6 

50 

20 

0 

3 

0 

7 

20 

20 

5 

3 

0 

8 

20 

20 

12 

1 

0 

9 

20 

20 

20 

3 

0 

10 

10 

50 

10 

3 

0 

11 

30 

30 

15 

3 

0 

12 

50 

50 

25 

3 

0 

13 

10 

50 

0 

7 

2 

14 

10 

50 

0 

7 

8 

15 

20 

50 

0 

3 

3 

16 

20 

50 

0 

3 

10 

17 

10 

50 

15 

3 

2 

18 

10 

50 

15 

7 

5 

19 

20 

50 

10 

7 

2 

20 

20 

50 

10 

3 

10 
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5.2.  CompariBon  of  QPSFA  constraint  deletion  strategies 

Three  parameters  involving  constraint  deletion  strategies  must  be  input  to  QPSFA 
in  order  to  specify  the  algorithm  completely.  The  computational  results  obtained 
by  varying  each  of  these  parameters  will  now  be  compared. 

The  first  parameter  is  the  logical  variable  PD.  If  the  QP  is  known  to  be  positive 
definite,  PD  is  set  to  true.  In  this  case,  QPSFA  will  allow  constraints  to  be  deleted 
from  the  working  set,  even  though  constraints  in  the  working  set  are  violated  (see 
Section  4.6.3).  If  it  is  not  known  that  the  QP  is  positive  definite,  PD  must  be  set 
to  false.  In  this  case,  QPSFA  will  allow  constraints  to  be  deleted  from  the  working 
set  only  if  all  of  the  constraints  in  the  working  set  are  satisfied. 

Setting  PD  to  true  resulted  in  0-60%  fewer  solves  of  the  augmented  system 
than  with  PD  set  to  false  (see  Table  4).  Note  that  in  this  case,  the  SR  should  be 
required  less  often  since  constraints  with  negative  multipliers  can  be  deleted  even 
though  constraints  in  the  working  set  are  violated.  Thus,  the  only  situation  that 
would  require  the  use  of  the  SR  is  when  the  current  working  set  defines  a  vertex, 
all  multipliers  are  positive,  and  the  current  iterate  is  still  infeasible.  This  situation 
occurred  only  a  few  times  in  all  of  the  test  runs  made.  Apparently,  the  strategy  of 
deleting  constraints  to  keep  the  dimension  of  the  null  space  larger — and  hence  allow 
the  Hessian  to  influence  the  determination  of  the  search  direction  more  is  effective. 
Thus,  PD  should  be  set  to  true  whenever  possible. 

The  second  parameter  is  the  real  variable  THETA,  corresponding  to  B  of  (4.46). 
The  values  of  THETA  tested  primarily  were  *ero  and  10®.  If  THETA  equals  sero,  the 
projected  gradient  must  be  zero  before  constraint  deletions  are  allowed.  If  THETA 
equals  10®,  constraint  deletions  are  allowed  before  the  minimum  on  the  manifold 
is  achieved  (see  Section  4.6.2).  When  THETA  was  set  to  values  larger  than  10®, 
the  algorithm  found  the  identical  sequence  of  iterates  as  with  THETA  set  to  10®,  in 
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almost  all  cases.  Each  problem  that  was  tested  with  values  between  tero  and  10*, 
tended  to  have  a  threshold  value  (or  small  range  of  values).  Below  this  threshold, 
the  sequence  of  iterates  found  was  the  same  as  with  THETA  equal  to  zero.  Above 
this  threshold,  the  sequence  was  the  same  as  with  THETA  equal  to  10*. 

In  positive-definite  problems,  the  best  value  of  THETA  is  strongly  correlated  to 
the  dimension  of  the  null  space  at  the  solution.  If  NZ  is  near  zero,  setting  THETA 
equal  to  zero  resulted  in  0-50%  fewer  solves  of  the  augmented  system  than  with 
THETA  equal  to  10*  (see  T^ble  4).  In  this  case,  the  solution  is  (almost)  a  vertex, 
so  maintaining  a  larger  null  space  by  deleting  constraints  when  not  at  a  minimum 
resulted  in  more  solves.  If  NZ  was  greater  than  N/4,  setting  THETA  equal  to  10* 
resulted  in  0-70%  fewer  solves  of  the  augmented  system  than  with  THETA  equal  to 
zero  (see  IVible  4).  In  this  case,  the  solution  has  a  large  null  space,  so  maintaining 
a  larger  nullspace  by  deleting  constraints  was  very  effective. 

In  indefinite  QP’s,  the  best  value  of  THETA  is  correlated  with  the  dimension  of 
the  null  space  at  the  solution  in  the  same  way,  but  not  as  strongly.  This  is  due 
to  the  fact  the  QPSFA  cannot  delete  constraints  (in  an  indefinite  QP)  when  the 
current  iterate  violates  any  constraint  in  the  working  set.  Thus,  QPSFA  tends  to 
find  working  sets  that  define  a  much  smaller  dimensional  null  space  when  solving 
indefinite  QP’s  than  when  solving  positive  definite  QP’s. 

In  LP’s,  the  null  space  at  the  solution  can  always  be  viewed  as  having  zero 
dimension.  In  these  problems,  a  THETA  value  of  zero  resulted  in  slightly  fewer  solves 
than  with  THETA  set  to  10*. 

Thus,  the  best  setting  of  THETA  depends  on  the  dimension  of  the  null  space  at 
the  solution.  If  the  dimension  is  near  zero,  set  THETA  to  zero.  If  the  dimension  is 
greater  than  N/4,  set  THETA  to  10*. 
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Table  4 

Number  of  solves  of  the  augmented  system 
when  QPSFA  parameters  are  varied 


Problem 

N 

NZ 

PD  /  THETA  values 

F/0 

F/IO* 

T/0 

T/10« 

1 

10 

0 

25 

33 

18 

34 

2 

10 

0 

31 

31 

20 

30 

3 

10 

0 

25 

25 

19 

18 

4 

10 

8 

19 

19 

15 

15 

5 

20 

15 

35 

33 

37 

18 

6 

50 

0 

40 

34 

35 

38 

7 

20 

5 

65 

57 

42 

18 

8 

20 

12 

30 

16 

30 

10 

g 

20 

20 

7 

7 

7 

5 

10 

10 

10 

3 

2 

3 

2 

The  third  parameter  is  the  integer  variable  ISR.  The  value  of  ISR  specifies 
which  strategy  the  SR  will  use  to  select  the  constraint  to  free,  when  there  are  two 
or  more  suitable  constraints  to  free  (see  Section  4.6.4).  A  constraint  is  suitable 
if  the  corresponding  P  is  positive.  If  ISR  equals  one,  the  suitable  constraint  with 
the  smallest  P  is  freed.  If  ISR  equals  two,  the  suitable  constraint  with  the  most 
negative  multiplier  is  freed.  If  ISR  equals  three,  the  suitable  constraint  with  the 
smallest  ratio  of  multiplier  to  corresponding  p  is  freed. 

In  the  problems  tested,  both  positive  definite  and  indefinite,  using  the  smallest 
P  resulted  in  0-80%  fewer  solves  of  the  augmented  system  than  either  of  the  other 
two  strategies.  (There  were  only  slight  differences  in  the  number  of  solves  between 
the  second  and  third  strategies.)  It  b  not  completely  clear  why  the  strategies  using 
multiplier  information  consistently  perform  less  efficiently  than  the  first  strategy. 
Part  of  the  reason  is  that  the  multipliers  used  in  the  test  correspond  to  the  current 

t 

working  set.  When  the  SR  is  used,  a  constraint  exchange  occurs  and  determines  the 
new  working  set.  The  multipliers  for  this  new  working  set  are  not  readily  available. 
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A  summary  of  the  best  values  for  the  three  parameters  is  given  by  the  following 
table. 


Table  6 

Summary  of  the  best  QPSFA  parameter  settings 


QP  characteristics 

Parameter  settings 

Hessian 

NZ 

PD 

THETA 

ISR 

positive  definite 

>Nf4 

TVue 

10® 

1 

positive  definite 

tts  0 

TVue 

sero 

1 

null  matrix  (LP) 

sero 

Fhlse 

sero 

1 

indefinite 

>N/4 

F^lse 

10® 

1 

indefinite 

«0 

F^lse 

sero 

1 

6.S.  Comparison  of  QPSFA  and  QPSOL 

This  section  will  compare  the  performances  of  QPSOL  and  QPSFA  on  identical 
problems.  The  QPSFA  parameter  values  used  in  the  comparison  are  given  in  Tbble 
3.  The  strategy  used  by  the  SR  (ISR  setting)  is  the  same  in  all  cases.  However, 
the  values  of  PD  and  THETA  vary  with  different  characteristics  of  the  QP.  Sometimes 
these  characteristics  will  not  be  known  in  advance.  However,  in  many  SQP  codes 
the  QP’s  generated  are  known  to  be  positive  definite,  so  PD  can  be  set  true.  In  order 
to  give  THETA  a  value  in  an  SQP,  the  dimension  of  the  null  space  at  the  solution  of 
the  previous  QP  can  be  used. 

QPSFA  has  more  lattitude  than  QPSOL  in  the  way  it  selects  the  initial  working 
set.  The  choice  of  initial  working  set  used  in  the  comparison  is  discussed  in  Section 

5.3.1.  The  next  two  sections  give  results  of  the  comparison  for  positive  definite  QP’s 
and  indefinite  QP’s,  respectively. 

6.3.1.  Initial  working  set  selection.  Given  a  starting  point,  QPSOL  deter¬ 
mines  the  initial  working  set  from  those  constraints  that  are  within  a  small  tolerance 
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of  being  exactly  satisfied.  From  the  same  starting  point,  QPSFA  can  use  these  al¬ 
most  satisfied  constraints,  but  it  could  also  add  violated  (by  more  than  the  small 
tolerance)  general  constraints  to  the  working  set.  In  the  comparison,  QPSFA  does 
not  add  violated  general  constraints.  These  constraints  are  not  added  to  the  initial 
working  set  for  several  reasons.  First,  different  initial  working  sets  would  make  the 
comparison  more  difiicult.  Second,  the  violated  general  constraints  that  QPSFA 
could  add  would  be  added  without  reference  to  gradient  information.  Constraint 
additions  within  the  algorithm  are  made  using  this  information.  Thus,  adding  them 
might  increase  the  number  of  solves  of  the  augmented  system.  Third,  if  NZ  is  large, 
decreasing  the  dimension  of  the  initial  null  space  by  adding  violated  general  con¬ 
straints  might  also  increase  the  number  of  solves.  Finally,  when  test  problems  were 
run  in  which  QPSFA  did  add  violated  general  constraints  to  the  initial  working  set, 
there  was  no  significant  change  in  the  number  of  solves  of  the  augmented  system.  (If 
NZ  is  small,  adding  as  many  violated  general  constraints  as  possible  would  appear 
to  be  a  sensible  strategy.  However,  when  this  strategy  was  used,  there  was  still  no 
significant  change  in  the  number  of  solves  of  the  augmented  system.) 

5.3.2.  Positive  definite  QP’s.  When  QPSFA  and  QPSOL^ere  used  on  positive 
definite  QP’s  with  a  wide  range  of  characteristics,  there  was  usually  at  most  a  20% 
difference  and  no  discernible  pattern  in  the  number  of  solves  of  the  augmented 
system.  (Refer  to  problems  1,  2,  3,  and  6  of  T^ble  6.)  However,  in  two  specific  cases 
this  was  not  true.  These  involve  the  dimension  of  the  null  space  at  the  solution  and 
the  conditioning  of  the  problem. 

The  first  case  involves  QP’s  which  have  a  value  of  NZ  of  approximate  dimension 
N/2.  Most  QP’s  tested  had  a  value  of  N  of  20  or  less.  In  these  problems,  QPSFA 
required  about  50%  fewer  solver,  than  QPSOL.  (Refer  to  problems  4,  5,  and  7 
through  10  of  Table  6.)  In  the  QP’s  tested  with  values  of  N  from  30  to  50,  QPSFA 
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required  about  70%  fewer  solves  than  QPSOL.  (Refer  to  problems  11  and  12  of 
Table  6.)  QPSOL  tended  to  find  a  vertex  in  the  feasibility  phase.  Therefore,  in 
the  optimality  phase,  constraints  had  to  be  deleted  to  increase  the  dimension  of 
the  null  space  back  up  to  N/2.  On  the  other  hand,  QPSFA  did  not  find  a  vertex, 
since  it  can  delete  constraints  when  constraints  in  the  working  set  are  violated. 
Thus,  the  dimension  of  the  null  space  stayed  approximately  constant  during  the 
first  iterations,  and  then  changed  towards  the  final  value  in  the  later  iterations.  In 
problems  with  larger  values  of  N,  we  would  expect  the  percentage  of  fewer  solves  to 
continue  to  increase.  (The  Appendix  contains  output  from  QPSOL  and  QPSFA  on 
a  positive-definite  QP  with  N  =  20  and  NZ  =  5.) 


Table  0 


Comparison  of  the  number  of  solves  of  the 
augmented  system  in  QPSFA  and  QPSOL 
(positive-definite  Hessians) 


The  second  case  where  the  performances  of  QPSOL  and  QPSFA  differ  signif¬ 
icantly  is  when  the  condition  numbers  of  the  Hessian  and  the  constraint  matrix 
decrease.  As  the  condition  numbers  varied,  the  number  of  solves  of  the  augmented 
sytem  required  by  QPSOL  did  not  change  significantly  or  present  a  discernible  pat- 
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tern.  However,  as  the  condition  numbers  decreased  from  10^  to  10^,  there  was  a 
30%  reduction  in  the  number  of  solves  required  by  QPSFA  (see  Table  7). 

This  reduction  can  perhaps  be  explained  by  considering  the  vxtreme  case  where 
the  condition  numbers  are  one  (i.e.,  the  Hessian  is  the  identity  matrix  and  the  only 
constraints  are  bounds).  In  this  case,  a  negative  multiplier  for  a  bound  in  any 
working  set  implies  that  the  bound  is  not  active  at  the  solution.  This  suggests  that 
as  the  condition  numbers  decrease,  the  constraints  deleted  by  QPSFA  are  more 
likely  not  to  be  active  at  the  solution  and  not  to  be  added  again  in  later  iterations. 
On  the  other  hand,  QPSOL  does  not  delete  constraints  on  the  basis  of  multiplier 
information  until  the  current  iterate  is  feasible  and  at  a  manifold  minimum. 

l^ble  7 

Comparison  of  the  number  of  solves  of  the 
augmented  system  when  the  condition  numbers  change 
(positive-definite  Hessians) 


Problem 

COND 

QPSFA 

QPSOL 

1 

7 

34 

30 

1 

2 

18 

32 

2 

7 

32 

34 

2 

2 

28 

26 

3 

7 

34 

30 

3 

2 

26 

30 

The  starting  point  of  most  of  the  tests  was  randomly  chosen.  However,  test 
runs  were  made  to  investigate  the  affect  of  different  starting  points.  The  different 
schemes  used  to  determine  a  starting  point  included  starting  at:  (i)  t  (the  optimal 
solution)  plus  a  step  in  the  direction  of  9(2)  (the  gradient  at  2),  (tt)  2  plus  a  step 
opposite  y(2),  (til)  the  middle  of  the  region  defined  by  the  upper  and  lower  bounds 
of  the  variables,  and  (tv)  the  minimum  of  the  QP  excluding  the  general  constraints. 
In  each  of  these  cases,  the  relative  performance  (in  terms  of  number  of  solves  of  the 
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augmented  system)  of  QPSOL  and  QPSFA  did  not  change  significantly.  However, 
if  we  compare  the  objective  values  at  the  starting  point,  the  first  feasible  point, 
and  the  optimal  point,  QPSFA  and  QPSOL  differ  greatly  in  case  (tw).  In  the  runs 
made  (see  Table  8),  the  initial  objective  value  was  of  order  —10®,  and  the  optimal 
objective  value  was  of  order  —10^.  In  each  case,  the  first  feasible  point  found  by 
QPSFA  was  the  optimal  point.  In  comparison,  the  first  feasible  point  found  by 
QPSOL  had  an  objective  value  of  approximately  order  10®.  Thus,  by  ignoring  the 
Hessian  information  during  the  feasible  phase,  QPSOL  fonnd  an  initial  feasible  point 
with  an  objective  value  significantly  greater  than  both  the  objective  value  at  the 
initial  point  and  the  optimal  objective  point. 


Table  8 

Comparison  of  objective  values  using  the 
unconstrained  minimum  as  the  starting  point 


N 

NZ 

objective  value  at 

initial 

point 

first  feasible 
point — QPSOL 

first  feasible 
point — QPSFA 

10 

0 

-5.9+05 

1.1+05 

-1.7+04 

10 

0 

-3.0+05 

3.7+04 

-1.7+04 

10 

5 

-2.3+05 

1.5+07 

-1.6+04 

10 

0 

-6.0+05 

4.9+03 

-2.1+04 

wsasn 

6.S.S.  Indefinite  QP’a.  When  QPSOL  and  QPSFA  were  used  on  indefinite  QP’s 
with  a  wide  range  of  characteristics,  QPSFA  required  from  —20  to  +100%  more 
solves  of  the  augmented  system  than  QPSOL  (see  Table  9).  Test  problems  with 
a  value  of  NZ  near  N/2  or  with  small  condition  numbers  did  not  yield  significantly 
different  numbers  of  solves,  as  they  did  in  positive  definite  QP’s.  This  was  expected 
since  QPSFA  (when  solving  an  indefinite  QP)  does  not  delete  constraints  from  the 
working  set  if  any  constraint  in  the  working  set  is  violated.  Varying  the  starting 
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point  did  not  result  in  any  significant  change  in  the  relative  performance  of  QPSOL 
and  QPSFA.  (In  many  instances,  no  comparison  was  possible  since  different  local 


minima  were  found.) 


Table  0 


Comparison  of  the  number  of  solves  of  the 
augmented  system  in  QPSFA  and  QPSOL 
(indefinite  Hessians) 


Problem 

N 

NZ 

QPSFA 

QPSOL 

13 

0 

46 

40 

14 

0 

26 

16 

15 

0 

108 

82 

16 

0 

49 

47 

17 

10 

15 

32 

22 

18 

10 

15 

34 

32 

19 

20 

10 

76 

84 

20 

20 

10 

71 

46 

Even  though  QPSFA  takes  more  solves  of  the  augmented  system  than  does  QP¬ 
SOL  (when  solving  indefinite  problems),  QPSFA  may  still  require  less  computation, 
particularly  in  a  large-scale  problem.  This  will  depend  on  what  portion  of  the  total 
work  is  needed  to  form  the  factorizations  of  the  initial  augmented  system.  If  the 
logic  of  QPSOL  were  adapted  to  a  large-scale  problem,  it  would  require  that  one 
additional  factorization  of  K  be  performed,  since  the  factorization  used  in  the  fea¬ 
sibility  phase  cannot  be  used  in  the  optimality  phase.  (Here,  we  are  assuming  that 
successive  augmented  systems  are  being  solved  as  in  Section  3.3.4.)  Thus,  if  this 
factorization  constitutes  a  large  portion  of  the  total  work  and  QPSFA  calculates 
this  factorization  only  a  few  times  (perhaps  once,  but  in  any  event  one  time  fewer 
than  QPSOL),  QPSFA  may  require  less  total  computational  effort,  even  though  it 
needs  more  solves  of  the  augmented  system. 
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This  appendix  describes  the  output  from  QPSOL  and  QPSFA  on  a  positive- 
definite  QP  with  20  variables  (N  =  20),  and  20  general  constraints.  The  QP  has 
five  bounds  and  ten  general  constraints  in  the  active  set.  The  dimension  of  the  null 
space  at  the  solution  is  therefore  five  (NZ  =  5). 

Table  10a  lists  the  output  of  the  feasibility  phase  of  QPSOL.  Thble  106  lists  the 
output  of  the  optimality  phase  of  QPSOL.  Table  11  lists  the  output  of  QPSFA. 

The  column  headings  of  interest  are  now  described.  “ITN"  gives  the  iteration 
number.  An  iteration  includes  at  most  one  deletion  and  one  addition.  The  index 
of  the  constraint  deleted  from  or  added  to  the  working  set  is  given  by  “JDEL" 
and  “JADD”,  respectively.  “STEP*  gives  the  step  length  a,  taken  along  the  search 
direction.  The  value  of  the  quadratic  objective  function  is  given  by  •OBJECTIVE". 
“NCOLZ"  gives  the  number  of  columns  of  the  null  space.  The  Euclidean  norm  of  the 
projected  gradient  is  given  by  “NORM  ZTG".  •COND  T"  and  •COND  ZHZ"  are  estimates 
of  the  condition  numbers  of  the  constraints  in  the  working  set  and  the  projected 
Hessian.  The  number  of  general  constraints  violated  by  the  current  iterate  is  given 
by  "NUMINF".  "SUNINF"  gives  the  sum  of  the  residuals  of  all  violated  constraints. 
“NORM  RES”  gives  the  Euclidean  norm  of  the  residuals  of  all  violated  constraints  in 
the  working  set.  If  the  Singularity  Rule  is  used  due  to  singularity  in  the  constraints, 
a  “T”  would  appear  in  the  column  •SINCT*.  A  single  line  of  output  is  printed  each 
time  a  constraint  is  deleted  or  added,  or  a  unit  step  is  taken. 

Comparing  the  output  will  highlight  some  of  the  important  features  of  QPSFA 
and  differences  between  QPSFA  and  QPSOL.  The  step  in  QPSOL  when  a  constraint 
is  added  is  always  nonzero.  In  QPSFA,  when  a  violated  constraint  is  added,  the 
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step  length  is  zero.  Such  a  zero  step  occurs  five  times  in  Table  11.  The  objective 
value  always  decreases  in  the  optimality  phase  of  QPSOL.  In  QPSFA  the  objec¬ 
tive  value  increases  during  certain  iterations.  This  occurs  because  QPSFA  does  not 
necessarily  find  a  descent  direction  (when  the  current  iterate  is  not  feasible).  A  fea¬ 
sible  vertex  is  determined  by  the  feasibility  phase  of  QPSOL  even  though  the  initial 
point  has  a  null  space  of  dimension  nine.  The  optimality  phase  of  QPSOL  starts  at 
this  feasible  vertex  and  must  build  the  dimension  of  the  null  space  back  up  to  five. 
QPSFA  leaves  NCOLZ  at  nine  or  ten,  until  the  final  iterations  in  which  the  correct 
null  space  of  dimension  five  is  found.  The  norm  of  the  projected  gradient  must 
be  close  to  zero  before  QPSOL  allows  a  constraint  deletion.  Since  QPSFA  allows 
constraint  deletions  before  the  norm  of  the  projected  gradient  vanishes,  this  norm 
is  close  to  zero  only  on  the  last  iteration.  A  feasible  point  (NUNINF  =  0)  is  found  by 
QPSFA  on  the  last  iteration.  (This  occured  in  approximately  20%  of  the  test  runs.) 
The  sum  of  the  infeasibilities  is  decreasing  in  both  the  feasibility  phase  of  QPSOL 
and  in  QPSFA.  (This  decrease  is  used  to  show  convergence  of  these  algorithms.) 
The  Singularity  Rule  was  not  needed  by  QPSFA  for  this  problem.  It  was  needed 
only  several  times  during  all  of  the  tests  with  positive-definite  Hessians  when  con¬ 
straint  deletions  were  allowed  before  the  current  iterate  satisfied  the  working  set. 
In  indefinite  problems,  the  Singularity  Rule  was  used  more  frequently. 


Solution  of  a  QP  problem  using  QPSOL 
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Table  11  no 


zeeoooeeooeooeoeeoeooeeooeoeoeeee 

pm  III 

OOOOOOppppOpOOOpOQOfiOOOOODOOOOOO 


toeeoeeooooooeeeoeeeoeeeeeoeeo^ 

I  I  I 

lOOOOSOOOOOOpppOOOAOOOpOOOOOOOO 

>«MrsirwrjrursjiAi04nm^»KKr^oAi/iiA<«^cD^^iAina>o*>«»^ 


M  M  «si  • 
jtoo 


I  o  o  o  e  e  o  I 


»eoeoooooeei 


ooeeoeopoepooeooppoDDpDeooDpoD 


£  o  o 
O  e  o 


oeooeeeeeoeeeoooe04»ooeeN»>^^*»MM 

eeoeooQOooooooooeoeoeeeooeoooo 

fioooeeofioofioooeooppppDOoeooooo 


eeooeeoeooi 


ooooeoeoopoeooeeoi 


ooopoppoeeoooopoooopoope&oooopoo 

erPP<p«wK^^^*.«»iA<^p^fftppoepp«vu>Kf^Kfi>ir>r« 


*mooopooppopop( 


'eoooooooeeooeoepo 


^ooeeooooapoooopepoooooooopooooQo 

tfw>bn4nminkn<r'«^Kini/^^»^<7peiApp^po7e^«»«B«»»^^^ 


^eeooeopoooopoeoooopooeeeoeeeeoe 

igeoeooooQooeeeooopoeeoepoeoeeep 


I  »  •  I  I  »  I 


•  f  •  I  •  I  I  I  I  I 


•ftieeoeeeeeoeooeoeoooooooooooeeeooe 

**  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  t  I  I  I  I  I  I  I  I  4  I  I 

*^5^$^ocoooooooooopopoooooPDOoeooe 

eeoeeo»ooeeo«woMO^e^eeo<f*o*^oi'»e*wo 

c>eeoee^»eeooer«»eiAooor>ooo^r«f^^eiAOM^o 

eeoeoeMeeeoer>er«eVovoeo«*^ii!^«o«>o»«r«r 
^  ^  p-#-j-j-/prpp  wps^  ml  3-/ 

Poe*6e«weooKior^o«opo«e^o«no»>^«9-^>o^eoee 

n 

^e^o^eoor^e^erfeiner>ep^o«Pw>oeeer^o*eee 

C  ^  ^  »  w»  ^ 

-> 


>4r«lAlA<«^N^>pP^POO•»M^^«ll^l^^Kp 


TABLE  11 

Solution  of  a  QP  problem  nsing  QPSFA 


